• Skiing

    Saturday morning: flew from Hongqiao to Chongli. Four of us together.

    Got to the hotel at Thaiwoo and ate instant noodles.

    In the evening we took a taxi to the night session at Wanlong and skied until closing. Slipped through the net to come back by way of Yunding. Lamb spine hot pot in Thaiwoo town.

    Sunday: there right at opening, skied until the afternoon close. Eight of us went into town for dinner, then browsed the ski shops.

    Monday: skied the whole morning again, grabbed a bowl of rice noodles, then took the high-speed train and connected to a flight back to Shanghai.

  • Reload Tmux Config

    1. Press the prefix key (Ctrl+b by default), then press : to enter command mode.

    At the command prompt, type the following and press Enter:

    set -g mouse off

    2. Press the prefix key (Ctrl+b by default), then press : to enter command mode.

    At the command prompt, type the following and press Enter:

    source-file ~/.tmux.conf

    You can confirm whether the configuration has taken effect by watching tmux's behavior or by running:

    tmux show-options -g

  • vllm-openllm-run-http-server

    Daily log, Tuesday.

    Played some pickleball; a decent workout, broke a good sweat.

    Brought up an HTTP service with vLLM and with llamafactory-cli api, one each, then pointed Dify at them to see whether they work.
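
    As a rough sketch of that kind of check (not the exact setup used here): query the OpenAI-compatible endpoint with the official openai client. The base URL, port, API key, and model name below are placeholder assumptions.

    # Minimal smoke test against an OpenAI-compatible endpoint such as the one
    # vLLM or llamafactory-cli api exposes. base_url, api_key, and model are
    # placeholder assumptions -- adjust them to the server that was actually started.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8000/v1",  # assumed default vLLM port
        api_key="EMPTY",  # vLLM does not require a real key unless configured
    )

    resp = client.chat.completions.create(
        model="Qwen/Qwen2-7B-Instruct",  # placeholder model name
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
    )
    print(resp.choices[0].message.content)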

  • Waking the Screen

    After being away for a while, I come back to the computer and one of the monitors won't light up.

    Figured out how to make it light up: blind-move the mouse over to the dark monitor, and that seems to wake it up.

    It was dark before, and moving the cursor onto it lights it up. Sometimes the cursor sticks for a second or two; just keep moving it.

  • Daily Log, Monday

    Added a Rerank step in Dify. But the crux is still the retrieval itself: the matched documents need to be both precise and complete.
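
    As a rough sketch of what a rerank step does (not Dify's internals): take the candidate documents that retrieval returns, re-score each one against the query with a cross-encoder, and keep only the top few. The model name, query, and candidate documents below are placeholders.

    # Sketch of a rerank step, assuming the sentence-transformers library.
    # The model name and candidate documents are placeholders, not what Dify uses.
    from sentence_transformers import CrossEncoder

    query = "how do I reload the tmux config"
    candidates = [
        "Press prefix then : and run source-file ~/.tmux.conf",
        "tmux is a terminal multiplexer",
        "set -g mouse off disables mouse support",
    ]

    reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # placeholder model
    scores = reranker.predict([(query, doc) for doc in candidates])

    # Keep the highest-scoring documents; retrieval still has to surface them first.
    ranked = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)
    for doc, score in ranked[:2]:
        print(f"{score:.3f}  {doc}")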

  • Dreaming

    Yesterday it occurred to me that I haven't dreamed in a long time. I don't know whether I really haven't, or whether I just forget on waking.

    This morning I woke up and, again, no dream.

    A bit of a pity. I used to dream every night and found it fun, like living an extra life.

    Then it suddenly came back to me: I did dream last night, and the whole plot rushed back at once.

    Good, good. I can still dream.

  • Docker To Podman Compose

    After switching from docker/docker-compose to podman/podman-compose, compose would not come up.

    Skipping all the log-reading and searching in between, one possible cause (partly a guess) is this:

    In docker-compose, if you do not configure a network, the default network is used. But after switching to Podman there is no default network (there is one called podman), so the network cannot be found.

    The workaround is to add a network configuration to every service in docker-compose.yml:

    networks:
      - podman

    If a service previously referenced the default network, change it to podman.

  • Podman Compose Missing Networks

    Q: On macOS, podman-compose fails to start with the error "RuntimeError: missing networks: default".

    A: According to the search results, Podman Compose cannot always create the default network automatically, so the default network has to be defined explicitly in the docker-compose.yml file. For example:

    version: '3.8'
    services:
      your_service:
        image: your_image
        networks:
          - default
    
    networks:
      default:
        driver: bridge
    
  • Vector Search

    https://medium.com/@vidiptvashist/building-a-vector-database-from-scratch-in-python-6bd683ba5171

    The key is still how to do the embedding, and how to compute similarity.

    Later, look into which embedding models and algorithms are commonly used.

    Also, the code only uses short strings. If a text is very long, how should it be chunked? Is that parameter important too? (There is a small chunking-and-embedding sketch after the code below.)

    from typing import Any
    
    import numpy as np
    
    
    class VectorStore:
        def __init__(self):
            self.vector_data: dict[str, np.ndarray] = {}  # A dictionary to store vectors
            self.vector_index: dict[str, dict] = {}  # An indexing structure for retrieval
    
        def add_vector(self, vector_id: str, vector: np.ndarray):
            """
            Add a vector to the store.
    
            Args:
                vector_id (str or int): A unique identifier for the vector.
                vector (numpy.ndarray): The vector data to be stored.
            """
            self.vector_data[vector_id] = vector
            self._update_index(vector_id, vector)
    
        def get_vector(self, vector_id):
            """
            Retrieve a vector from the store.
    
            Args:
                vector_id (str or int): The identifier of the vector to retrieve.
    
            Returns:
                numpy.ndarray: The vector data if found, or None if not found.
            """
            return self.vector_data.get(vector_id)
    
        def _update_index(self, vector_id, vector):
            """
            Update the index with the new vector.
    
            Args:
                vector_id (str or int): The identifier of the vector.
                vector (numpy.ndarray): The vector data.
            """
            # In this simple example, we use brute-force cosine similarity for indexing
            for existing_id, existing_vector in self.vector_data.items():
                similarity = np.dot(vector, existing_vector) / (
                    np.linalg.norm(vector) * np.linalg.norm(existing_vector)
                )
                if existing_id not in self.vector_index:
                    self.vector_index[existing_id] = {}
                self.vector_index[existing_id][vector_id] = similarity
    
        def find_similar_vectors(self, query_vector, num_results=5):
            """
            Find similar vectors to the query vector using brute-force search.
    
            Args:
                query_vector (numpy.ndarray): The query vector for similarity search.
                num_results (int): The number of similar vectors to return.
    
            Returns:
                list: A list of (vector_id, similarity_score) tuples for the most similar vectors.
            """
            results: list[tuple[str, float]] = []
            for vector_id, vector in self.vector_data.items():
                similarity = np.dot(query_vector, vector) / (
                    np.linalg.norm(query_vector) * np.linalg.norm(vector)
                )
                results.append((vector_id, similarity))
    
            # Sort by similarity in descending order
            results.sort(key=lambda x: x[1], reverse=True)
    
            # Return the top N results
            return results[:num_results]
    
    
    # Establish a VectorStore instance
    vector_store = VectorStore()  # Creating an instance of the VectorStore class
    
    # Define sentences
    sentences = [  # Defining a list of example sentences
        "I eat mango",
        "mango is my favorite fruit",
        "mango, apple, oranges are fruits",
        "fruits are good for health",
    ]
    
    # Tokenization and Vocabulary Creation
    vocabulary: set[str] = set()  # Initializing an empty set to store unique words
    for sentence in sentences:  # Iterating over each sentence in the list
        tokens = (
            sentence.lower().split()
        )  # Tokenizing the sentence by splitting on whitespace and converting to lowercase
        vocabulary.update(tokens)  # Updating the set of vocabulary with unique tokens
    
    # Assign unique indices to vocabulary words
    word_to_index = {
        word: i for i, word in enumerate(vocabulary)
    }  # Creating a dictionary mapping words to unique indices
    
    # Vectorization
    
    # Initializing an empty dictionary to store sentence vectors
    sentence_vectors: dict[str, np.ndarray] = {}
    for sentence in sentences:  # Iterating over each sentence in the list
        tokens = (
            sentence.lower().split()
        )  # Tokenizing the sentence by splitting on whitespace and converting to lowercase
        vector = np.zeros(
            len(vocabulary)
        )  # Initializing a numpy array of zeros for the sentence vector
        for token in tokens:  # Iterating over each token in the sentence
            vector[
                word_to_index[token]
            ] += 1  # Incrementing the count of the token in the vector
        sentence_vectors[sentence] = (
            vector  # Storing the vector for the sentence in the dictionary
        )
    
    # Store in VectorStore
    for sentence, vector in sentence_vectors.items():  # Iterating over each sentence vector
        vector_store.add_vector(
            sentence, vector
        )  # Adding the sentence vector to the VectorStore
    
    # Similarity Search
    query_sentence = "Mango is the best fruit"  # Defining a query sentence
    query_vector = np.zeros(
        len(vocabulary)
    )  # Initializing a numpy array of zeros for the query vector
    query_tokens = (
        query_sentence.lower().split()
    )  # Tokenizing the query sentence and converting to lowercase
    for token in query_tokens:  # Iterating over each token in the query sentence
        if token in word_to_index:  # Checking if the token is present in the vocabulary
            query_vector[
                word_to_index[token]
            ] += 1  # Incrementing the count of the token in the query vector
    
    similar_sentences = vector_store.find_similar_vectors(
        query_vector, num_results=2
    )  # Finding similar sentences
    
    # Display similar sentences
    print("Query Sentence:", query_sentence)  # Printing the query sentence
    print("Similar Sentences:")  # Printing the header for similar sentences
    for (
        sentence,
        similarity,
    ) in similar_sentences:  # Iterating over each similar sentence and its similarity score
        print(
            f"{sentence}: Similarity = {similarity:.4f}"
        )  # Printing the similar sentence and its similarity score
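
    A small sketch of the chunking and embedding questions above, assuming the sentence-transformers library. The model name, chunk size, and overlap are placeholder choices; real chunkers usually split on sentences or tokens rather than raw characters.

    # Sketch: fixed-size character chunking with overlap, then embedding each chunk.
    # Assumes sentence-transformers is installed; the model name, chunk size, and
    # overlap below are placeholder choices, not recommendations from the article.
    import numpy as np
    from sentence_transformers import SentenceTransformer


    def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
        """Split text into overlapping character windows."""
        step = size - overlap
        return [text[i : i + size] for i in range(0, max(len(text) - overlap, 1), step)]


    model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model

    long_text = "mango is my favorite fruit. " * 30  # stand-in for a long document
    chunks = chunk_text(long_text)
    embeddings = model.encode(chunks, normalize_embeddings=True)

    query_vec = model.encode(["which fruit do I like best?"], normalize_embeddings=True)[0]
    # With normalized vectors, cosine similarity is just a dot product.
    scores = embeddings @ query_vec
    print("best chunk:", chunks[int(np.argmax(scores))])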
    
  • This Is Not Effective Communication

    A: Why is DeepSeek's cost so low? B: I don't know. A: You don't know? You tech people should all know this stuff. B: I don't know. A: Then what do you know?