Discussing the article: "Integrate Your Own LLM into EA (Part 1): Hardware and Environment Deployment"

 

Check out the new article: Integrate Your Own LLM into EA (Part 1): Hardware and Environment Deployment.

With the rapid development of artificial intelligence today, large language models (LLMs) are an important part of it, so we should think about how to integrate powerful LLMs into our algorithmic trading. For most people it is difficult to fine-tune these powerful models to their own needs, deploy them locally, and then apply them to algorithmic trading. This series of articles takes a step-by-step approach to achieving that goal.

When deploying LLMs locally, hardware configuration is a very important part. Here we mainly discuss mainstream PCs and do not cover macOS or other niche platforms.

The hardware used to deploy LLMs mainly involves the CPU, GPU, memory, and storage devices. The CPU and GPU are the main computing devices for running models, while memory and storage hold the models and data.

The correct hardware configuration not only ensures the running efficiency of the model but also affects its performance to a certain extent. Therefore, we need to choose the appropriate hardware configuration according to our needs and budget.
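As a rough sanity check before choosing a model, you can inspect what the local machine actually offers. Below is a minimal Python sketch, assuming PyTorch and psutil are installed; the VRAM figures in the comments are ballpark rules of thumb, not measurements from the article:

```python
# Quick hardware inventory before deploying an LLM locally.
import psutil          # pip install psutil
import torch           # pip install torch

# Memory and storage hold the model weights and data.
ram_gb = psutil.virtual_memory().total / 1024**3
disk_gb = psutil.disk_usage("/").free / 1024**3
print(f"RAM: {ram_gb:.1f} GB, free disk: {disk_gb:.1f} GB")

# The GPU (if present) is the main computing device for inference.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    print(f"GPU: {props.name}, VRAM: {vram_gb:.1f} GB")
    # Rule of thumb: a 7B model in fp16 needs ~14 GB of VRAM;
    # 4-bit quantization brings that down to roughly 4-5 GB.
    if vram_gb < 8:
        print("Low VRAM: consider a quantized model or CPU offloading.")
else:
    print("No CUDA GPU found: inference falls back to the CPU (slow).")
```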

Author: Yuqiang Pan

 

On hardware and OS there are only general words; the benchmarks are of desktop video cards, yet the processors discussed are mobile ones. It is abstract and not applicable to the task.

It feels like the article was generated by an AI.

 

I wonder if an LLM can be converted to ONNX and how much it would weigh :)

It seems possible.

RWKV-4 weighs less than a gig.

GitHub - tpoisonooo/llama.onnx: LLaMa/RWKV onnx models, quantization and testcase (github.com)
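For what it's worth, one way to try the conversion yourself is via Hugging Face Optimum. This is a minimal sketch, assuming optimum[onnxruntime] and transformers are installed; "gpt2" is just a small stand-in model, not one from the article or the repo above:

```python
# Export a small causal LM to ONNX and check the resulting file size.
# Assumes: pip install "optimum[onnxruntime]" transformers
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

model_id = "gpt2"  # stand-in; swap in the model you actually want to test
tokenizer = AutoTokenizer.from_pretrained(model_id)

# export=True converts the PyTorch checkpoint to ONNX on the fly.
model = ORTModelForCausalLM.from_pretrained(model_id, export=True)
model.save_pretrained("gpt2_onnx")

# Smoke test: run generation through onnxruntime.
inputs = tokenizer("Hello, world", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The size of the exported .onnx file then directly answers the "how much it would weigh" question; quantization, as in the llama.onnx repo, shrinks it further.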
 
NVIDIA has released a demo version of the Chat with RTX bot for local installation on Windows. The bot has no built-in knowledge base and works with whatever data is available on the particular computer, plus it can process the content of YouTube videos via links. Installation requires at least 40 GB of disk space and an RTX 30/40-series GPU with at least 8 GB of video memory.
There was a news story like this the other day.