Using your own hardware for LLMs
Things have come a long way since this post was written.
This article assumes you have a powerful machine (referred to as the “server”) that you want to use for running the models (inference), and a second machine (referred to as the “client”) from which you actually interact with them.
Prerequisites
- Jan.ai installed on both the server and client machines
- A specific model of your choice downloaded on the server
- Network connectivity between the client and server (e.g., firewalls configured correctly; a basic reachability check is sketched after this list)
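If you want to confirm the network path between the two machines, a plain TCP check from the client is enough; run it once the API server (set up in the next section) is listening. This is a minimal sketch, and the IP address and port below are placeholders for your own values.

```python
# Minimal reachability sketch: confirm the client can open a TCP connection
# to the port the Jan API server listens on (run this after "Start Server").
# The IP and port are placeholders -- substitute your own values.
import socket

SERVER_IP = "192.168.1.50"   # placeholder: your server's LAN IP
SERVER_PORT = 1337           # placeholder: the port you pick for the API server

try:
    with socket.create_connection((SERVER_IP, SERVER_PORT), timeout=3):
        print(f"{SERVER_IP}:{SERVER_PORT} is reachable from this client.")
except OSError as exc:
    print(f"Cannot reach {SERVER_IP}:{SERVER_PORT} -- check firewalls/routing: {exc}")
```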
Setting up the server
After downloading your model in Jan.ai on the server, click “Local API Server” near the bottom-left corner of the window.
On the hosting page, use the following settings:
- Set the host IP to “0.0.0.0”
- Choose any desired port number
- In Model Settings, select the model you have already downloaded
- Click “Start Server” to start the local API server (a quick check that it is responding is sketched after this list).
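To confirm the server is actually answering before touching the client configuration, you can query it over HTTP. The sketch below assumes the server exposes the usual OpenAI-compatible /v1/models route; the IP address and port are placeholders for your own values.

```python
# Minimal sketch: list the models the Jan API server is serving.
# Assumes the standard OpenAI-compatible /v1/models route; IP and port are placeholders.
import json
import urllib.request

SERVER_IP = "192.168.1.50"   # placeholder: your server's IP address
PORT = 1337                  # placeholder: the port chosen above

with urllib.request.urlopen(f"http://{SERVER_IP}:{PORT}/v1/models", timeout=5) as resp:
    models = json.load(resp)

# Should include the model you selected in Model Settings.
print([m.get("id") for m in models.get("data", [])])
```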
Setting up the client
- Navigate to the Jan.ai configuration folder, which can be found in Settings -> Advanced Settings -> Jan Data Folder.
- Create a file named `model.json` at the path `models/local-local/model.json`.
- Add the following content to the `model.json` file, replacing `$MODEL_ID` with your server’s model ID, such as `hermes-pro-7b`:
{
  "sources": [
    {
      "url": "https://jan.ai"
    }
  ],
  "id": "$MODEL_ID",
  "object": "model",
  "name": "local test",
  "version": "1.0",
  "description": "Test server",
  "format": "api",
  "settings": {},
  "metadata": {
    "author": "test",
    "tags": ["remote"]
  },
  "engine": "openai",
  "state": "ready"
}
- Restart the Jan.ai application on the client machine.
- Access Settings -> OpenAI Inference Engine and enter `http://$IP:$PORT/v1/chat/completions` in the “Chat Completions Endpoint” field, replacing `$IP` with your server’s IP address and `$PORT` with the port you chose on the server. Leave the API key field blank or it won’t work. You can also test the endpoint directly from the client, as sketched below.
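The request below is the same kind of call Jan makes behind the scenes, sent straight from the client machine. This is a sketch rather than Jan’s internal code: the endpoint URL and model ID are placeholders and must match your `$IP`, `$PORT` and the `id` from `model.json`.

```python
# Minimal sketch: a chat completion request against the OpenAI-compatible
# endpoint configured above. Endpoint URL and model ID are placeholders.
import json
import urllib.request

ENDPOINT = "http://192.168.1.50:1337/v1/chat/completions"  # placeholder $IP/$PORT
payload = {
    "model": "hermes-pro-7b",  # must match the model ID used in model.json
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}

request = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request, timeout=120) as resp:
    reply = json.load(resp)

print(reply["choices"][0]["message"]["content"])
```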
Using the model running on your server
- Create a new thread in the Jan.ai application on the client machine.
- In the model dropdown menu on the right, select “remote” and then choose “local test”.
- Start sending messages as usual and (hopefully) have a faster experience.
Congratulations, you’re now performing inference on another machine!
Conclusion
By following these steps, you can utilise your own hardware for running LLMs while interacting with them from another machine, and enjoy the improved performance, privacy and flexibility this setup offers.