Integrate Your Own LLM into EA (Part 1): Hardware and Environment Deployment

MetaTrader 5 — Trading | 18 October 2023, 10:39

2 204

Yuqiang Pan

Introduction

This article is mainly aimed at those who hope to economically and practically deploy LLMs in a local environment and can fine-tune the model as needed. We will detail the required hardware configuration and software environment configuration, and provide some practical suggestions and tips.

We hope that by reading this article, you will have a clear understanding of the hardware conditions required to deploy LLMs, and be able to successfully build a basic deployment environment according to the guidance of this article.

Table of contents:

What Is the LLMs

1. About LLMS

LLMs (large language models) are fundamental models that utilize deep learning in natural language processing (NLP) and natural language generation (NLG) tasks. To help them learn the complexity and connections of language, large language models are pre-trained on large amounts of data. LLM is essentially a neural network based on a Transformer (Don't ask me what Transformer is, haven't you heard of " Attention is All You Need "?).By LLMs, we mainly refer to open-source models, such as LLama2. Like OpenAI's closed-source ChatGPT is powerful, but it can't take advantage of our personalized data. Of course, they can also be integrated with algorithmic trading, but that is not what this series of articles is about.

2. Why and How to Apply LLMs to Algorithmic Trading

Why did we choose LLMS? The reason is simple: they have strong logical understanding and reasoning skills. So here's the problem, how can we use this powerful capability in algorithmic trading? As a simple example, what would you do if you want to make an algorithmic trading by using the Fibonacci sequence? If you implement it with traditional programming methods, it's a brutal thing. But if we use LLMs, we only need to make the corresponding data set according to Fibonacci theory, and then use this data set to fine-tune the LLMs, and directly call the results generated by LLMs in our algorithmic trading. Everything just got easier.

3. Why Deployment LLMs Locally

1). Local deployment can provide higher data security. For some applications involving sensitive information, such as medical, financial and other fields, data security is crucial. By deploying LLMs in a local environment, we can ensure that data does not leave our control range.

2). Local deployment can provide higher performance. By choosing the appropriate hardware configuration, we can optimize the running efficiency of the model according to our needs. In addition, the network latency in the local environment is usually lower than that in the cloud environment, which is very important for some applications that require real-time response.

3). By deploying LLMs in a local environment and fine-tuning them according to our personalized data, we can get a model that better meets our needs. This is because each application scenario has its uniqueness, and through fine-tuning, we can make the model better adapt to our application scenario.

In summary, local deployment of LLMs can not only provide higher data security and performance but also allow us to get a model that better meets our needs. Therefore, understanding how to deploy LLMs in a local environment and fine-tune them according to our personalized data is very important for how we integrate into algorithmic trading. In the following content, we will detail how to choose hardware configuration and build software environment.

Hardware Configuration

When deploying LLMs locally, hardware configuration is a very important part. Here we mainly discuss mainstream PCs, and do not discuss MacOS and other niche products.

The products used to deploy LLMs mainly involve CPU, GPU, memory, and storage devices. Among them, the CPU and GPU are the main computing devices for running models, and memory and storage devices are used to store models and data.

The correct hardware configuration can not only ensure the running efficiency of the model but also affect the performance of the model to a certain extent. Therefore, we need to choose the appropriate hardware configuration according to our needs and budget.

1. Processor

Currently, due to the emergence of frameworks like Llama.cpp, language models can be directly inferred or trained on CPUs. So if you don't have a standalone graphics card, don't worry, the CPU is also an option for you.

At present, the main processor brands on the market are Intel and AMD. Intel's Xeon series and AMD's EPYC series are currently the most popular server-level processors. These processors have high frequency, multi-core, and large cache, which are very suitable for running large language models (LLMs).

When choosing a processor, we need to consider its number of cores, frequency, and cache size. Generally speaking, processors with more cores, higher frequency, and larger cache perform better, but they are also more expensive. Therefore, we need to choose according to our own budget and needs.

For local deployment of LLMs, we recommend the following hardware configuration:

CPU: At least 4 cores with a frequency of at least 2.5GHz

The following is the current CPU market situation for reference:

cpu

You can find more information in here: Pass Mark CPU Benchmarks

2. Graphics Card

For running large language models (LLMs), graphics cards are very important. The computing power and memory size of the graphics card directly affect the training and inference speed of the model. Generally speaking, graphics cards with stronger computing power and larger memory perform better, but they are also more expensive.

NVIDIA's Tesla and Quadro series as well as AMD's Radeon Pro series are currently the most popular professional graphics cards. These graphics cards have a large amount of memory and powerful parallel computing capabilities that can effectively accelerate model training and inference.

For local deployment of LLMs, we recommend the following hardware configuration:

Graphics card with 8G or more memory.

The following is the current GPU market situation for reference:

gpuy

You can find more information in here: Pass Mark Software - Video Card (GPU) Benchmarks

3. Memory

DDR5 has become mainstream. For running large language models (LLMs), we recommend using as much memory as possible.

For local deployment of LLMs, we recommend the following hardware configuration:

At least 16GB of DDR4 memory.

4. Storage

The speed and capacity of storage devices directly affect data reading speed and data storage capacity. Generally speaking, storage devices with faster speed and larger capacity perform better but are also more expensive.

Solid State Drives (SSDs) have become mainstream storage devices due to their high speed and low latency. NVMe interface SSDs are gradually replacing SATA interface SSDs due to their higher speed. Solid State Drives (SSDs) have become mainstream storage devices due to their high speed and low latency.

For local deployment of LLMs, we recommend the following hardware configuration:

At least 1TB NVMe SSD.

Software Environment Configuration

1. Building and Analysis of Different Operating System Environments

When deploying LLMs locally, we need to consider the choice of operating system environment. Currently, the most common operating system environments include Windows, Linux, and MacOS. Each operating system has its own advantages and disadvantages, and we need to choose according to our own needs.

- Windows:

The Windows operating system is loved by a large number of users because of its user-friendly interface and rich software support. However, due to its closed-source nature, Windows may not be as good as Linux and MacOS in some advanced features and customization.

- Linux:

The Linux operating system is known for its open source, highly customizable, and powerful command line tools. For users who need to perform large-scale computing or server deployment, Linux is a good choice. However, due to its steep learning curve, it may be a bit difficult for beginners.

- MacOS:

The MacOS operating system is loved by many professional users for its elegant design and excellent performance. However, due to its hardware restrictions, MacOS may not be suitable for users who need to perform large-scale computing.

- Windows + WSL:

WSL allows us to run a Linux environment on the Windows operating system, which allows us to enjoy the user-friendly interface of Windows while using the powerful command line tools of Linux.

The above is our simple analysis of different operating system environments. We strongly recommend using Windows + WSL, which can take care of daily use and serve our professional computing.

2. Related Software Configuration

After choosing the appropriate hardware configuration, we need to install and configure some necessary software on the operating system. These softwires include the operating system, programming language environment, development tools, etc. Here are some common software configuration steps:

- Operating System:

First, we need to install an operating system. For training and inference of large language models (LLMs), we recommend using the Linux operating system because it provides powerful command line tools and a highly customizable environment.

- Programming Language Environment:

We need to install the Python environment because most LLMs are developed using Python. We recommend using Anaconda to manage the Python environment because it can easily install and manage Python packages.

- Development Tools:

We need to install some development tools such as text editors (such as VS Code or Sublime Text), version control tools (such as Git), etc.

- LLMs Related Software:

We need to install some software related to LLMs such as TensorFlow or PyTorch deep learning frameworks, as well as Hugging Face's Transformers library etc.

3. Recommended Configuration

Here are some recommended software tools and versions:

- Operating System:

We recommend using Windows11+Ubuntu 20.04 LTS (WSL). This is a stable and widely used Linux distribution with a rich package and good community support.

- Python Environment:

We recommend using Anaconda to manage the Python environment. Anaconda is a popular Python data science platform that can easily install and manage Python packages. We recommend using Python 3.10 because it is a stable and widely supported version.

- Deep Learning Framework:

We recommend using PyTorch 2.0 or above which is one of the most popular deep learning.

Cloud computing platform

1. Platform

Here are some of the most popular ones:

1). Amazon Web Services (AWS)

Amazon Web Services (AWS) is a cloud computing platform provided by Amazon, Inc., with data centers around the world offering more than 200 full-featured services. AWS' services span infrastructure technologies such as compute, storage, and databases, as well as emerging technologies such as machine learning, artificial intelligence, data lakes and analytics, and the Internet of Things. AWS' customers include millions of active users and organizations of all sizes and industries, including startups, large enterprises, and government agencies.

2). Microsoft Azure

Microsoft Azure is a cloud computing platform provided by Microsoft that has the following advantages. Azure has multiple layers of security measures that cover the data center, infrastructure, and operations layers to protect customer and organizational data. Azure also offers the most comprehensive compliance coverage, supporting more than 90 compliance products and industry-leading service-level agreements. Azure enables seamless operation across on-premises, multiple clouds, and edge environments, with tools and services designed for hybrid clouds, such as Azure Arc and Azure Stack1. Azure also allows customers to use their own Windows Server and SQL Server licenses and save up to 40% on Azure. Azure supports all languages and frameworks, allowing customers to build, deploy, and manage applications in the language or platform of their choice. Azure also provides a wealth of AI and machine learning services and tools that allow customers to create intelligent applications and leverage OpenAI Whisper models for high-quality transcription and translation. Azure offers more than 200 services and tools that allow customers to achieve unlimited scale of applications and take advantage of innovations in the cloud. Azure has rolled out more than 1,000 new features in the last 12 months, covering areas such as AI, machine learning, virtualization, Kubernetes, and databases.

3). Google Cloud Platform (GCP)

Google Cloud Platform (GCP) is a suite of cloud computing services and tools offered by Google. GCP allows customers to build, deploy, and scale applications, websites, and services on the same infrastructure as Google. GCP provides various technologies such as compute, storage, databases, analytics, machine learning, and Internet of Things. GCP is used by millions of customers across different industries and sectors, including startups, enterprises, and government agencies. GCP is also one of the most secure, flexible, and reliable cloud computing environments available today.

2. Advantages of Cloud Computing

- Cost-Effective: Cloud computing is more cost-effective compared to traditional IT infrastructure, as users only pay for the computing resources they use.

- Managed Infrastructure: The cloud provider manages the underlying infrastructure, including hardware and software.

- Scalability: The cloud is eminently scalable, allowing organizations to easily adjust their resource usage based on their needs.

- Global and Accessible: The cloud is global, convenient, and accessible, accelerating the time to create and deploy software applications.

- Unlimited Storage Capacity: No matter what cloud you use, you can buy all the storage you could ever need.

- Automated Backup/Restore: Cloud backup is a service in which the data and applications on a business's servers are backed up and stored on a remote server.

3. Disadvantages of Cloud Computing

- Data Security: While cloud providers implement robust security measures, storing sensitive data in the cloud can still pose risks.

- Compliance Regulations: Depending on your industry, there may be regulations that limit what data can be stored in the cloud.

- Potential for Outages: While rare, outages can occur with cloud services, leading to potential downtime.

Conclusion

Local deployment of LLMs and fine-tuning with personalized data is a complex but very valuable task, and the correct hardware configuration and software environment configuration are key to achieving this goal.

In terms of hardware configuration, we need to choose the appropriate CPU, GPU, memory, and storage devices according to our own budget and needs. In terms of software environment configuration, we need to choose the appropriate programming language interpreter, deep learning framework, and other necessary libraries according to our own operating system.

Although these tasks may encounter some challenges, as long as we have clear goals and are willing to invest time and energy to learn and practice, we will definitely succeed.

I hope this article can provide you with some insights. In the next article, we will give an example of environment deployment.

Stay tuned!

Warning: All rights to these materials are reserved by MetaQuotes Ltd. Copying or reprinting of these materials in whole or in part is prohibited.

RWKV-4 weighs less than a gig.

homepage

Rashid Umarov | 23 Feb 2024 at 11:41

NVIDIA выпустила демоверсию бота Chat with RTX для локальной установки на Windows . Бот не имеет встроенной базы знаний и работает со всеми доступными данными на конкретном компьютере, плюс может обрабатывать содержимое YouTube видео по ссылкам. Для установки боту нужно не менее 40 Гб на диске, и GPU серии RTX 30/40 с минимум 8 Гб видеопамяти.

There was a news story like this the other day

Discrete Hartley transform

In this article, we will consider one of the methods of spectral analysis and signal processing - the discrete Hartley transform. It allows filtering signals, analyzing their spectrum and much more. The capabilities of DHT are no less than those of the discrete Fourier transform. However, unlike DFT, DHT uses only real numbers, which makes it more convenient for implementation in practice, and the results of its application are more visual.

Learn how to deal with date and time in MQL5

A new article about a new important topic which is dealing with date and time. As traders or programmers of trading tools, it is very crucial to understand how to deal with these two aspects date and time very well and effectively. So, I will share some important information about how we can deal with date and time to create effective trading tools smoothly and simply without any complicity as much as I can.

Structures in MQL5 and methods for printing their data

In this article we will look at the MqlDateTime, MqlTick, MqlRates and MqlBookInfo strutures, as well as methods for printing data from them. In order to print all the fields of a structure, there is a standard ArrayPrint() function, which displays the data contained in the array with the type of the handled structure in a convenient tabular format.

Launching MetaTrader VPS: A step-by-step guide for first-time users

Everyone who uses trading robots or signal subscriptions sooner or later recognizes the need to rent a reliable 24/7 hosting server for their trading platform. We recommend using MetaTrader VPS for several reasons. You can conveniently pay and manage the subscription through your MQL5.community account.

Introduction

What Is the LLMs

Hardware Configuration

Software Environment Configuration

Cloud computing platform

Conclusion

Other articles by this author

RWKV-4 weighs less than a gig.