Yunfeng's Simple Blog

Fixing the Git LFS pointer error

Posted on 2025-06-02, updated on 2025-08-03

1. Problem description

When managing a repository with git, you sometimes run into an error like this:

Encountered 1 file that should have been a pointer, but wasn't:
    /path/to/file

or like this:

Encountered <n> files that should have been pointers, but weren't:
    /path/to/file1
    /path/to/file2

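After some investigation, I found that the root cause is that files which should be managed by Git LFS were committed directly to git instead. The pointer mentioned in the error is a file in Git LFS format: it does not contain the full data, only a reference to where the full data lives.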
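To make the idea of a pointer concrete: a Git LFS pointer is a small text file whose first line names the LFS spec version, followed by an oid line and a size line. The sketch below checks whether a blob looks like such a pointer. The helper name looks_like_lfs_pointer is my own and not part of Git LFS, and the check is deliberately loose.

```python
import re

# First line of every pointer file in the Git LFS v1 format
LFS_SPEC_LINE = b"version https://git-lfs.github.com/spec/v1\n"
OID_LINE = re.compile(rb"^oid sha256:[0-9a-f]{64}$", re.MULTILINE)
SIZE_LINE = re.compile(rb"^size \d+$", re.MULTILINE)


def looks_like_lfs_pointer(data: bytes) -> bool:
    """Return True if data has the shape of a Git LFS pointer file."""
    # Pointer files are tiny; real payloads are usually far larger
    if len(data) >= 1024 or not data.startswith(LFS_SPEC_LINE):
        return False
    return bool(OID_LINE.search(data)) and bool(SIZE_LINE.search(data))


pointer = (
    b"version https://git-lfs.github.com/spec/v1\n"
    + b"oid sha256:" + b"a" * 64 + b"\n"
    + b"size 12345\n"
)
print(looks_like_lfs_pointer(pointer))
print(looks_like_lfs_pointer(b"\x89PNG\r\n\x1a\n" + b"\x00" * 2000))
```

A file that is tracked by LFS but fails this kind of check is the situation the error message reports.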

Read more »

bitnet-b1.58-2b-4t

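Posted on 2025-04-20, updated on 2025-08-03

1. Overview

I noticed that Microsoft recently released a new BitNet version, bitnet-b1.58-2B-4T, whose parameters take only the values {-1, 0, 1} and which runs well on ordinary CPUs. There are not many test results online yet, so I decided to try it and see how well it actually works.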
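As a rough illustration of what weights restricted to {-1, 0, 1} look like, here is a minimal sketch of absmean rounding, which is how I understand the BitNet b1.58 paper to describe its weight quantization. Whether the released model is produced exactly this way is an assumption on my part, and the function name absmean_ternary is hypothetical.

```python
import math


def absmean_ternary(weights, eps=1e-8):
    """Map float weights to {-1, 0, 1} plus one shared scale factor."""
    # Scale by the mean absolute value, then round and clip to [-1, 1]
    scale = sum(abs(w) for w in weights) / len(weights)
    quantized = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return quantized, scale


weights = [0.9, -0.05, -1.2, 0.4]
quantized, scale = absmean_ternary(weights)
print(quantized, round(scale, 4))

# Three possible values carry log2(3) bits of information per weight
print(f"bits per ternary weight: {math.log2(3):.2f}")
```

The last line shows where the 1.58 in the model name comes from.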

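I tried two approaches: testing on the website provided by the project, and downloading the model to build a local C++ inference environment. Both were tested on CPU only. Overall, the results of both are still unsatisfying: the model shows some intelligence but not much, although it really is fast. As the model keeps iterating, it may eventually take a path completely different from models deployed on cloud GPUs.

Read more »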

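Markdown syntax hidden by Neovim's conceal mechanism

Posted on 2025-02-15

conceal is a mechanism in Vim/Neovim for tidying up what is displayed: it can replace certain syntax markers with a more compact visual representation, or hide them entirely. It is often used in formats such as Markdown and LaTeX to improve readability, but it has a drawback: it becomes hard to tell whether your Markdown markup is complete, so a file may end up missing a closing ``` and render incorrectly. Since I do not need to see the rendered result inside my editor, I can turn this feature off entirely.

To find out whether the behavior came from a plugin or from Neovim itself, I opened Neovim with the following command, which starts it without my configuration and the plugins it sets up:

nvim -u NORC /path/to/file

The Markdown file then displayed normally, which confirmed that the problem was introduced by a plugin.

I could have enabled the plugins one by one to find the culprit, but with so many plugins that was too tedious. Instead I asked DeepSeek R1 and verified that the following settings solve the problem:

let g:vim_markdown_conceal = 0
let g:vim_markdown_conceal_code_blocks = 0

Some people say the indentLine settings cause it, but I did not verify that.

This search is also how I learned the term conceal.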

Quotation Armin Ronacher's Reflecting on Life

Posted on 2025-02-15

Whether it’s working on a project, solving a difficult problem, or even refining soft skills like communication, the act of showing up and putting in the hours is essential. Practice makes perfect, but more so it’s all about progress rather than perfection. Each hour you spend iterating, refining, failing and retrying brings you closer to excellence. It doesn’t always feel that way in the moment but when you look back at what you did before, you will see your progress. And that act of looking back, and seeing how you improved, is immensely rewarding and in turn makes you enjoy your work.

Armin Ronacher’s Reflecting on Life

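A quick guide to GitPod

Posted on 2025-02-15

GitPod is a cloud development IDE. You can visit gitpod.io, link your GitHub account, and open any project on GitHub, or install the browser extension and open the IDE directly from the GitHub website.

By default GitPod opens an online VS Code environment backed by a container hosted overseas, with the following specs:

CPU: 16 cores, AMD EPYC 7B13
Memory: 64 GB
Storage: 30 GB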

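Because its servers are overseas, it can quickly download models from GitHub, Google Drive, or Hugging Face. You can then start a simple web server with Python (python -m http.server) and fetch the model from your local machine with wget, at a decent speed.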
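If wget is not available locally, the download step can also be done in Python. This is a minimal sketch using only the standard library. The function name download is my own, and the URL in the usage comment is a placeholder rather than a real GitPod address.

```python
import shutil
import urllib.request
from pathlib import Path


def download(url: str, dest: Path, chunk_size: int = 1 << 20) -> int:
    """Stream url to dest in chunks and return the number of bytes written."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        # Copy chunk by chunk so large model files never sit fully in memory
        shutil.copyfileobj(response, out, chunk_size)
    return dest.stat().st_size


# Usage (placeholder URL; substitute whatever address exposes port 8000):
#   download("http://<workspace-host>:8000/model.bin", Path("model.bin"))
```

Streaming in 1 MiB chunks keeps memory use flat regardless of model size.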

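One of GitPod's selling points is fast startup of a development environment. You can configure the startup environment by specifying a dotfiles repository in the settings at https://gitpod.io/user/preferences

This dotfiles repository can hold your usual rc files and similar configuration, so that you get a familiar environment right away. For example, I put my own configuration at https://github.com/vra/dotfiles, and my familiar development environment is ready as soon as the workspace starts.

In short, GitPod can serve as a free temporary server and online IDE, and it is quite handy for occasional use.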

poetrystrands

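Posted on 2025-02-15

I came across PoetryStrands, a fun website for connecting lines of classical Chinese poetry. The site has a clean, minimal style, and it is a nice way to review classical poems when you have spare time. Simple, creative websites like this deserve to be recorded.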

as-a-junior-engineer

Posted on 2025-02-15

As a junior engineer, there’s simply no substitute for getting the first 100K lines of code under your belt. The “start over each day” method will help get you to those 100K lines faster. You might think covering the same ground multiple times isn’t as valuable as getting 100K diverse lines of code. I disagree. Solving the same problem repeatedly is actually really beneficial for retaining knowledge of patterns you figure out. You only need 5K perfect lines to see all the major patterns once. The other 95K lines are repetition to rewire your neurons.

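From Algorithms we develop software by. This passage really resonates with me: many things can only be truly mastered through constant repetition. Take walking: managing to walk once does not mean you have learned to walk. Only by walking again and again, until you no longer notice that you are walking, have you truly learned it.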

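New plans for this blog

Posted on 2025-02-15

AI technology is advancing rapidly, and more and more things can be done with AI.

For an ordinary person, knowledge and skills are now within easy reach, and memorizing things matters less; a distinctive way of thinking is what sets you apart from others. Against this backdrop, everyone increasingly needs the ability to think independently, so every idea and insight of your own deserves to be recorded.

As a blogger, perhaps in the future (or even now?), AI could use your blog posts to reconstruct how you think. For each new event, the AI would give your assessment, which you could then compare with what you really think. Wouldn't that be interesting?

Based on these thoughts, I have decided to post my technical content on this blog in full detail: quotes from technical material I come across, brief comments, the process of trying new things, my journey through reading code, the steps of reinventing wheels, and so on. In short, everything gets recorded, big or small. I believe that once enough content has accumulated, an AI working from this blog's corpus together with the code I have written could accurately reconstruct a programmer avatar of me, and then maybe I will not need to write code anymore, haha.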

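Solving the TransformerEncoder ONNX export problem

Posted on 2025-01-29, updated on 2025-02-15

1. Problem description

When using PyTorch's TransformerEncoder, exporting to ONNX bakes the sequence length into the graph, so the exported model cannot accept variable-length input. The following simple example reproduces the problem: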

import onnxruntime as ort
import torch
import torch.nn as nn


class SimpleTransformer(nn.Module):
    def __init__(self, input_dim=512, num_layers=6, nhead=8):
        super().__init__()
        # Input projection used in forward(). It was missing from the original
        # listing, so a same-width linear layer is assumed here.
        self.input_proj = nn.Linear(input_dim, input_dim)

        # Create one Transformer encoder layer
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=input_dim,
            nhead=nhead,
            dim_feedforward=2048,
            dropout=0.1,
            activation="relu",
            batch_first=True,  # inputs are (batch, seq, feature)
        )

        # Stack the layers into a Transformer encoder
        self.transformer_encoder = nn.TransformerEncoder(
            encoder_layer, num_layers=num_layers
        )

    def forward(self, x):
        # Input shape: (batch_size, seq_len, input_dim)
        x = self.input_proj(x)
        output = self.transformer_encoder(x)
        return output


# Instantiate the model
model = SimpleTransformer(input_dim=512, num_layers=2, nhead=8)
model.eval()  # switch to evaluation mode

# Example input (batch_size=2, seq_len=10, input_dim=512)
dummy_input = torch.randn(2, 10, 512)

# Export the ONNX model
torch.onnx.export(
    model,
    (dummy_input,),
    "transformer_encoder.onnx",
    do_constant_folding=True,  # fold constants during export
    input_names=["input"],  # name of the input node
    output_names=["output"],  # name of the output node
    # Declare seq_len as dynamic; without this every input dimension is fixed
    dynamic_axes={"input": {1: "seq_len"}, "output": {1: "seq_len"}},
    # dynamo=True,  # uncomment to try the TorchDynamo-based exporter
)

print("ONNX model exported successfully!")

# Validate the exported model with a different sequence length (11, not 10)
dummy_input2 = torch.randn(2, 11, 512)
ort_session = ort.InferenceSession("transformer_encoder.onnx")
outputs = ort_session.run(
    None,
    {"input": dummy_input2.numpy()}
)
print("ONNX output shape:", outputs[0].shape)

The export uses a sequence length of 10 and the validation uses a sequence length of 11, which fails at runtime with:

2025-01-29 14:17:25.266794 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running Reshape node. Name:'/transformer_encoder/layers.0/self_attn/Reshape_4' Status Message: /Users/runner/work/1/s/onnxruntime/core/providers/cpu/tensor/reshape_helper.h:47 onnxruntime::ReshapeHelper::ReshapeHelper(const onnxruntime::TensorShape &, onnxruntime::TensorShapeVector &, bool) input_shape_size == size was false. The input tensor cannot be reshaped to the requested shape. Input shape:{11,2,512}, requested shape:{10,16,64}

Traceback (most recent call last):
  File "/Users/ws/export.py", line 63, in <module>
    outputs = ort_session.run(
              ^^^^^^^^^^^^^^^^
  File "/Users/ws/miniforge3/lib/python3.12/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 266, in run
    return self._sess.run(output_names, input_feed, run_options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Reshape node. Name:'/transformer_encoder/layers.0/self_attn/Reshape_4' Status Message: /Users/runner/work/1/s/onnxruntime/core/providers/cpu/tensor/reshape_helper.h:47 onnxruntime::ReshapeHelper::ReshapeHelper(const onnxruntime::TensorShape &, onnxruntime::TensorShapeVector &, bool) input_shape_size == size was false. The input tensor cannot be reshaped to the requested shape. Input shape:{11,2,512}, requested shape:{10,16,64}

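I also tried the TorchDynamo-based ONNX Exporter provided by PyTorch 2+ (adding dynamo=True to torch.onnx.export), and got the same error.

2. How to fix it

Several issues on PyTorch's GitHub discuss this problem and propose a solution, but for some reason the fix has never been merged upstream.

The fix is fairly simple: change two lines in torch/nn/functional.py. The steps are as follows.

First, locate functional.py in the current Python environment with this one-liner:

python -c "import torch, os; print(os.path.join(os.path.dirname(torch.__file__), 'nn', 'functional.py'))"

Then open the file and search for k = k.view(k.shape[0 . There is only one match, at around line 6200 (the exact line depends on the PyTorch version):

k = k.view(k.shape[0], bsz * num_heads, head_dim).transpose(0, 1)

As you can see, this line reads k.shape[0], whose value gets frozen as a constant during ONNX export. Change the line to

k = k.view(-1, bsz * num_heads, head_dim).transpose(0, 1)

Likewise, search for v = v.view(v.shape[0] . Again there is only one match, right after the code above. The original line is:

v = v.view(v.shape[0], bsz * num_heads, head_dim).transpose(0, 1)

Change it to

v = v.view(-1, bsz * num_heads, head_dim).transpose(0, 1)

Save the file and rerun the export and validation script above; everything now works.

This approach requires modifying the PyTorch source, which is inconvenient: the edit has to be repeated for every new environment and every new machine. I hope the problem gets fixed upstream soon.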
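Until then, the manual edit can be scripted so that it is easy to repeat. The sketch below is my own helper, not an official tool. It assumes the two lines appear in functional.py exactly as quoted above, which may not hold for every PyTorch version, and it refuses to touch the file if they do not.

```python
import shutil
from pathlib import Path

# The two edits described above, as (old line, new line) pairs
REPLACEMENTS = [
    (
        "k = k.view(k.shape[0], bsz * num_heads, head_dim).transpose(0, 1)",
        "k = k.view(-1, bsz * num_heads, head_dim).transpose(0, 1)",
    ),
    (
        "v = v.view(v.shape[0], bsz * num_heads, head_dim).transpose(0, 1)",
        "v = v.view(-1, bsz * num_heads, head_dim).transpose(0, 1)",
    ),
]


def patch_source(source: str) -> str:
    """Return source with both view() calls rewritten, or raise ValueError."""
    for old, new in REPLACEMENTS:
        count = source.count(old)
        if count != 1:
            raise ValueError(f"expected exactly 1 match for {old!r}, found {count}")
        source = source.replace(old, new)
    return source


def patch_file(path: Path) -> Path:
    """Patch the file in place and return the path of the backup copy."""
    patched = patch_source(path.read_text(encoding="utf-8"))
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)  # keep the original next to the patched file
    path.write_text(patched, encoding="utf-8")
    return backup


# Usage (requires torch to be installed):
#   import torch
#   patch_file(Path(torch.__file__).parent / "nn" / "functional.py")
```

Running it a second time raises ValueError because the original lines are no longer present, which doubles as a check that the patch has already been applied.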

3. Related issues

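Python lru_cache: usage and a walk through the source

Posted on 2025-01-29, updated on 2025-02-15

1. Usage

functools.cache and functools.lru_cache are decorators from the functools module of the Python standard library. They cache a function's results so that repeated calls with the same arguments run faster.

Here is a simple example: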

from functools import lru_cache
import timeit

@lru_cache
def factorial(n):
    return n * factorial(n-1) if n else 1

execution_time1 = timeit.timeit("factorial(64)", globals=globals(), number=10000)
execution_time2 = timeit.timeit("factorial.__wrapped__(64)", globals=globals(), number=10000)

print(f"Execution time1: {execution_time1:.4f} seconds")
print(f"Execution time2: {execution_time2:.4f} seconds")
print(f"Speedup: {execution_time2/execution_time1:.4f} times")

Here __wrapped__ refers to the original function inside the decorator, that is, the bare function before the decorator was applied.

The code prints:

Execution time1: 0.0004 seconds
Execution time2: 0.0016 seconds
Speedup: 3.5078 times

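As you can see, caching the intermediate results of factorial with lru_cache gives a 3.5x speedup in this measurement. Note that the recursive call inside factorial.__wrapped__ still goes through the cached factorial, so the second timing covers a single uncached call rather than a fully uncached recursion.
The example also shows that lru_cache is simple to use:

Import lru_cache: from functools import lru_cache
Add the @lru_cache decorator to the function.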
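What __wrapped__ does in the benchmark above can be checked by counting how often the function body actually runs. This is a small sketch of my own; the stats dictionary is just an illustrative counter.

```python
from functools import lru_cache

stats = {"body_runs": 0}  # how many times the undecorated body executes


@lru_cache
def factorial(n):
    stats["body_runs"] += 1
    return n * factorial(n - 1) if n else 1


factorial(64)
runs_after_first_call = stats["body_runs"]    # one run each for n = 64 .. 0

factorial(64)
runs_after_cached_call = stats["body_runs"]   # unchanged: served from the cache

factorial.__wrapped__(64)
runs_after_wrapped_call = stats["body_runs"]  # one more run; factorial(63) is a cache hit

print(runs_after_first_call, runs_after_cached_call, runs_after_wrapped_call)
```

So calling __wrapped__ bypasses the cache only for the outermost call; the recursion underneath still benefits from it.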

Looking at the source, the signature of lru_cache is:

def lru_cache(maxsize=128, typed=False):

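The maxsize parameter is the maximum number of results to keep in the cache, 128 by default. Once the cache holds more than maxsize results, it follows the least-recently-used (LRU) policy: the entry that has gone unused for the longest time is evicted to make room for the new result. With maxsize=None the cache is unbounded, but memory usage can grow accordingly, so keep an eye on it.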
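The eviction behavior is easy to observe with a tiny cache. The following sketch uses maxsize=2 and records which arguments actually reach the function body; the names square and computed are made up for the illustration.

```python
from functools import lru_cache

computed = []  # arguments for which the body really ran


@lru_cache(maxsize=2)
def square(n):
    computed.append(n)
    return n * n


square(1)  # miss
square(2)  # miss, cache is now full
square(1)  # hit, 1 becomes the most recently used entry
square(3)  # miss, evicts 2 (the least recently used entry)
square(1)  # hit, 1 survived the eviction
square(2)  # miss again, 2 has to be recomputed

info = square.cache_info()
print(computed)
print(info)
```

cache_info() reports hits, misses, maxsize, and the current size, which makes it a convenient way to check whether a chosen maxsize is large enough.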

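The typed parameter controls whether arguments of different types are cached separately even when their values compare equal. For example, with typed=True, f(decimal.Decimal("3.0")) and f(3.0) are cached as separate entries.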
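The difference shows up when the function's result depends on the argument type. In this sketch, the two describe functions are hypothetical examples that simply return the type name of their argument.

```python
from decimal import Decimal
from functools import lru_cache


@lru_cache(maxsize=None, typed=False)
def describe_untyped(x):
    return type(x).__name__


@lru_cache(maxsize=None, typed=True)
def describe_typed(x):
    return type(x).__name__


# Decimal("3.0") == 3.0 and both hash alike, so without typed=True
# the second call reuses the entry created by the first one.
untyped_results = [describe_untyped(3.0), describe_untyped(Decimal("3.0"))]

# With typed=True the argument type is part of the cache key.
typed_results = [describe_typed(3.0), describe_typed(Decimal("3.0"))]

print(untyped_results, describe_untyped.cache_info())
print(typed_results, describe_typed.cache_info())
```

With typed=False the Decimal call gets back the result computed for the float, which is harmless for purely numeric functions but wrong for anything type-sensitive.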

Read more »