• Skiing

    Saturday morning: flew from Hongqiao to Chongli. Four of us together.

    Got to the hotel at Thaiwoo and ate instant noodles.

    In the evening we took a taxi to the night session at Wanlong and skied until closing. Slipped through the net to come back by way of Yunding. Lamb spine hot pot in Thaiwoo town.

    Sunday: there right at opening, skied until the afternoon close. Eight of us went into town for dinner, then browsed the ski shops.

    Monday: skied the whole morning again, grabbed a bowl of rice noodles, then took the high-speed train and connected to a flight back to Shanghai.

  • Reload Tmux Config

    1. Press the prefix key (Ctrl+b by default), then press : to enter command mode.

    At the command prompt, type the following and press Enter:

    set -g mouse off

    2. Press the prefix key (Ctrl+b by default), then press : to enter command mode.

    At the command prompt, type the following and press Enter:

    source-file ~/.tmux.conf

    You can confirm whether the configuration has taken effect by watching tmux's behavior or by running:

    tmux show-options -g

  • vllm-openllm-run-http-server

    Daily log, Tuesday.

    Played some pickleball; a decent workout, broke a good sweat.

    Brought up an HTTP service with vLLM and with llamafactory-cli api, one each, then pointed Dify at them to see whether they work.
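
    As a rough sketch of that kind of check (not the exact setup used here): query the OpenAI-compatible endpoint with the official openai client. The base URL, port, API key, and model name below are placeholder assumptions.

    # Minimal smoke test against an OpenAI-compatible endpoint such as the one
    # vLLM or llamafactory-cli api exposes. base_url, api_key, and model are
    # placeholder assumptions -- adjust them to the server that was actually started.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8000/v1",  # assumed default vLLM port
        api_key="EMPTY",  # vLLM does not require a real key unless configured
    )

    resp = client.chat.completions.create(
        model="Qwen/Qwen2-7B-Instruct",  # placeholder model name
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
    )
    print(resp.choices[0].message.content)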

  • Waking the Screen

    After being away for a while, I come back to the computer and one of the monitors won't light up.

    Figured out how to make it light up: blind-move the mouse over to the dark monitor, and that seems to wake it up.

    It was dark before, and moving the cursor onto it lights it up. Sometimes the cursor sticks for a second or two; just keep moving it.

  • Daily Log, Monday

    Added a Rerank step in Dify. But the crux is still the retrieval itself: the matched documents need to be both precise and complete.
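
    As a rough sketch of what a rerank step does (not Dify's internals): take the candidate documents that retrieval returns, re-score each one against the query with a cross-encoder, and keep only the top few. The model name, query, and candidate documents below are placeholders.

    # Sketch of a rerank step, assuming the sentence-transformers library.
    # The model name and candidate documents are placeholders, not what Dify uses.
    from sentence_transformers import CrossEncoder

    query = "how do I reload the tmux config"
    candidates = [
        "Press prefix then : and run source-file ~/.tmux.conf",
        "tmux is a terminal multiplexer",
        "set -g mouse off disables mouse support",
    ]

    reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # placeholder model
    scores = reranker.predict([(query, doc) for doc in candidates])

    # Keep the highest-scoring documents; retrieval still has to surface them first.
    ranked = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)
    for doc, score in ranked[:2]:
        print(f"{score:.3f}  {doc}")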

  • Dreaming

    Yesterday it occurred to me that I haven't dreamed in a long time. I don't know whether I really haven't, or whether I just forget on waking.

    This morning I woke up and, again, no dream.

    A bit of a pity. I used to dream every night and found it fun, like living an extra life.

    Then it suddenly came back to me: I did dream last night, and the whole plot rushed back at once.

    Good, good. I can still dream.

  • Docker To Podman Compose

    After switching from docker/docker-compose to podman/podman-compose, compose would not come up.

    Skipping all the log-reading and searching in between, one possible cause (partly a guess) is this:

    In docker-compose, if you do not configure a network, the default network is used. But after switching to Podman there is no default network (there is one called podman), so the network cannot be found.

    The workaround is to add a network configuration to every service in docker-compose.yml:

    networks:
      - podman

    If a service previously referenced the default network, change it to podman.

  • Podman Compose Missing Networks

    Q: On macOS, podman-compose fails to start with the error "RuntimeError: missing networks: default".

    A: According to the search results, Podman Compose cannot always create the default network automatically, so the default network has to be defined explicitly in the docker-compose.yml file. For example:

    version: '3.8'
    services:
      your_service:
        image: your_image
        networks:
          - default
    
    networks:
      default:
        driver: bridge
    
  • Vector Search

    https://medium.com/@vidiptvashist/building-a-vector-database-from-scratch-in-python-6bd683ba5171

    The key is still how to do the embedding, and how to compute similarity.

    Later, look into which embedding models and algorithms are commonly used.

    Also, the code only uses short strings. If a text is very long, how should it be chunked? Is that parameter important too? (There is a small chunking-and-embedding sketch after the code below.)

    from typing import Any
    
    import numpy as np
    
    
    class VectorStore:
        def __init__(self):
            self.vector_data: dict[str, np.ndarray] = {}  # A dictionary to store vectors
            self.vector_index: dict[str, dict] = {}  # An indexing structure for retrieval
    
        def add_vector(self, vector_id: str, vector: np.ndarray):
            """
            Add a vector to the store.
    
            Args:
                vector_id (str or int): A unique identifier for the vector.
                vector (numpy.ndarray): The vector data to be stored.
            """
            self.vector_data[vector_id] = vector
            self._update_index(vector_id, vector)
    
        def get_vector(self, vector_id):
            """
            Retrieve a vector from the store.
    
            Args:
                vector_id (str or int): The identifier of the vector to retrieve.
    
            Returns:
                numpy.ndarray: The vector data if found, or None if not found.
            """
            return self.vector_data.get(vector_id)
    
        def _update_index(self, vector_id, vector):
            """
            Update the index with the new vector.
    
            Args:
                vector_id (str or int): The identifier of the vector.
                vector (numpy.ndarray): The vector data.
            """
            # In this simple example, we use brute-force cosine similarity for indexing
            for existing_id, existing_vector in self.vector_data.items():
                similarity = np.dot(vector, existing_vector) / (
                    np.linalg.norm(vector) * np.linalg.norm(existing_vector)
                )
                if existing_id not in self.vector_index:
                    self.vector_index[existing_id] = {}
                self.vector_index[existing_id][vector_id] = similarity
    
        def find_similar_vectors(self, query_vector, num_results=5):
            """
            Find similar vectors to the query vector using brute-force search.
    
            Args:
                query_vector (numpy.ndarray): The query vector for similarity search.
                num_results (int): The number of similar vectors to return.
    
            Returns:
                list: A list of (vector_id, similarity_score) tuples for the most similar vectors.
            """
            results: list[tuple[str, float]] = []
            for vector_id, vector in self.vector_data.items():
                similarity = np.dot(query_vector, vector) / (
                    np.linalg.norm(query_vector) * np.linalg.norm(vector)
                )
                results.append((vector_id, similarity))
    
            # Sort by similarity in descending order
            results.sort(key=lambda x: x[1], reverse=True)
    
            # Return the top N results
            return results[:num_results]
    
    
    # Establish a VectorStore instance
    vector_store = VectorStore()  # Creating an instance of the VectorStore class
    
    # Define sentences
    sentences = [  # Defining a list of example sentences
        "I eat mango",
        "mango is my favorite fruit",
        "mango, apple, oranges are fruits",
        "fruits are good for health",
    ]
    
    # Tokenization and Vocabulary Creation
    vocabulary: set[str] = set()  # Initializing an empty set to store unique words
    for sentence in sentences:  # Iterating over each sentence in the list
        tokens = (
            sentence.lower().split()
        )  # Tokenizing the sentence by splitting on whitespace and converting to lowercase
        vocabulary.update(tokens)  # Updating the set of vocabulary with unique tokens
    
    # Assign unique indices to vocabulary words
    word_to_index = {
        word: i for i, word in enumerate(vocabulary)
    }  # Creating a dictionary mapping words to unique indices
    
    # Vectorization
    
    # Initializing an empty dictionary to store sentence vectors
    sentence_vectors: dict[str, np.ndarray] = {}
    for sentence in sentences:  # Iterating over each sentence in the list
        tokens = (
            sentence.lower().split()
        )  # Tokenizing the sentence by splitting on whitespace and converting to lowercase
        vector = np.zeros(
            len(vocabulary)
        )  # Initializing a numpy array of zeros for the sentence vector
        for token in tokens:  # Iterating over each token in the sentence
            vector[
                word_to_index[token]
            ] += 1  # Incrementing the count of the token in the vector
        sentence_vectors[sentence] = (
            vector  # Storing the vector for the sentence in the dictionary
        )
    
    # Store in VectorStore
    for sentence, vector in sentence_vectors.items():  # Iterating over each sentence vector
        vector_store.add_vector(
            sentence, vector
        )  # Adding the sentence vector to the VectorStore
    
    # Similarity Search
    query_sentence = "Mango is the best fruit"  # Defining a query sentence
    query_vector = np.zeros(
        len(vocabulary)
    )  # Initializing a numpy array of zeros for the query vector
    query_tokens = (
        query_sentence.lower().split()
    )  # Tokenizing the query sentence and converting to lowercase
    for token in query_tokens:  # Iterating over each token in the query sentence
        if token in word_to_index:  # Checking if the token is present in the vocabulary
            query_vector[
                word_to_index[token]
            ] += 1  # Incrementing the count of the token in the query vector
    
    similar_sentences = vector_store.find_similar_vectors(
        query_vector, num_results=2
    )  # Finding similar sentences
    
    # Display similar sentences
    print("Query Sentence:", query_sentence)  # Printing the query sentence
    print("Similar Sentences:")  # Printing the header for similar sentences
    for (
        sentence,
        similarity,
    ) in similar_sentences:  # Iterating over each similar sentence and its similarity score
        print(
            f"{sentence}: Similarity = {similarity:.4f}"
        )  # Printing the similar sentence and its similarity score
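
    A small sketch of the chunking and embedding questions above, assuming the sentence-transformers library. The model name, chunk size, and overlap are placeholder choices; real chunkers usually split on sentences or tokens rather than raw characters.

    # Sketch: fixed-size character chunking with overlap, then embedding each chunk.
    # Assumes sentence-transformers is installed; the model name, chunk size, and
    # overlap below are placeholder choices, not recommendations from the article.
    import numpy as np
    from sentence_transformers import SentenceTransformer


    def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
        """Split text into overlapping character windows."""
        step = size - overlap
        return [text[i : i + size] for i in range(0, max(len(text) - overlap, 1), step)]


    model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model

    long_text = "mango is my favorite fruit. " * 30  # stand-in for a long document
    chunks = chunk_text(long_text)
    embeddings = model.encode(chunks, normalize_embeddings=True)

    query_vec = model.encode(["which fruit do I like best?"], normalize_embeddings=True)[0]
    # With normalized vectors, cosine similarity is just a dot product.
    scores = embeddings @ query_vec
    print("best chunk:", chunks[int(np.argmax(scores))])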
    
  • This Is Not Effective Communication

    A: Why is DeepSeek's cost so low? B: I don't know. A: You don't know? You tech people should all know this stuff. B: I don't know. A: Then what do you know?