
What is the Cost of Hosting & Running a Self-owned LLAMA in AWS?

For a variety of reasons, some businesses choose to employ less powerful but open-source LLMs (large language models) in their AI (artificial intelligence) applications.

People making such a decision usually want to save on the cost of calling APIs, or they distrust the data-protection measures of AI service providers.

Multiable takes no stand on this approach. In any case, let's take a look at the running cost of a self-owned LLAMA, one of the most popular open-source LLMs at the moment.

Many people make a mistake when they plan to set up a 'usable' LLAMA in the cloud, notably ignoring a range of cloud services necessary for a production run.

Yup, maybe 1 out of 5 IT teams quote only the cost of a UAT environment when applying for budget from management, and then things turn ugly when the system goes live!

In fact, hosting a LLAMA model in Amazon Web Services (AWS) involves several cost components across different AWS services; a short sketch after the list below shows how these per-unit rates add up.

  1. Amazon EC2 (Elastic Compute Cloud):
    • Pricing depends on the instance type and configuration chosen. For hosting LLAMA, a GPU instance such as the p3.2xlarge is recommended for intensive machine learning tasks.
    • p3.2xlarge Instance: Approx. USD3.06 per hour.
    • p3.8xlarge Instance: Approx. USD12.24 per hour.
    • Reserved Instances and Spot Instances can offer significant cost savings.
  2. Amazon S3 (Simple Storage Service):
    • Used for storing datasets and model checkpoints.
    • Standard Storage: USD0.023 per GB per month.
    • Infrequent Access Storage: USD0.0125 per GB per month.
    • Glacier Storage (for archived models): USD0.004 per GB per month.
  3. Amazon EBS (Elastic Block Store):
    • Provides persistent block storage for use with EC2 instances.
    • General Purpose SSD (gp2): USD0.10 per GB per month.
    • Provisioned IOPS SSD (io1): Varies based on provisioned IOPS and storage size.
  4. Amazon VPC (Virtual Private Cloud):
    • Networking costs may be incurred for data transfer between services.
    • Data Transfer Out: First 1 GB per month is free, USD0.09 per GB for up to 10 TB per month.
  5. AWS Lambda:
    • For any serverless functions required in processing.
    • Lambda Functions: USD0.20 per 1 million requests, plus USD0.00001667 per GB-second of compute time.
  6. Amazon CloudWatch:
    • Monitoring and logging services for the infrastructure.
    • Custom Metrics: USD0.30 per metric per month.
    • Logs: USD0.50 per GB ingested, USD0.03 per GB archived.
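To see how these per-unit rates combine, here is a minimal Python sketch that tallies one month of a small production setup. The usage figures (one p3.2xlarge running around the clock, 500 GB in S3, 1 TB of EBS, 200 GB of outbound transfer, and modest Lambda and CloudWatch use) are illustrative assumptions, not a sizing recommendation.

# Rough monthly cost sketch using the example rates quoted above.
# All usage figures are illustrative assumptions, not measurements.

HOURS_PER_MONTH = 730  # common monthly averaging convention

RATES = {
    "ec2_p3_2xlarge_per_hour": 3.06,
    "s3_standard_per_gb_month": 0.023,
    "ebs_gp2_per_gb_month": 0.10,
    "data_transfer_out_per_gb": 0.09,      # after the first free GB
    "lambda_per_million_requests": 0.20,
    "lambda_per_gb_second": 0.00001667,
    "cloudwatch_metric_per_month": 0.30,
    "cloudwatch_logs_per_gb_ingested": 0.50,
}

def monthly_cost(ec2_hours, s3_gb, ebs_gb, transfer_gb,
                 lambda_million_requests, lambda_gb_seconds,
                 custom_metrics, log_gb_ingested):
    """Sum the main monthly cost components listed above."""
    return (
        ec2_hours * RATES["ec2_p3_2xlarge_per_hour"]
        + s3_gb * RATES["s3_standard_per_gb_month"]
        + ebs_gb * RATES["ebs_gp2_per_gb_month"]
        + max(transfer_gb - 1, 0) * RATES["data_transfer_out_per_gb"]
        + lambda_million_requests * RATES["lambda_per_million_requests"]
        + lambda_gb_seconds * RATES["lambda_per_gb_second"]
        + custom_metrics * RATES["cloudwatch_metric_per_month"]
        + log_gb_ingested * RATES["cloudwatch_logs_per_gb_ingested"]
    )

# One GPU instance 24/7, 500 GB S3, 1 TB EBS, 200 GB transfer out,
# 2 million Lambda requests, 100,000 GB-seconds, 10 metrics, 20 GB logs.
print(f"Estimated monthly bill: USD{monthly_cost(HOURS_PER_MONTH, 500, 1000, 200, 2, 100_000, 10, 20):,.2f}")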

Estimating the complete annual cost of hosting and running a self-owned LLAMA in AWS depends on several factors, including computing power, data storage, network transfer costs, and other ancillary services.

Compute: AWS provides various instance types suitable for large language models, such as GPU-based EC2 instances. For example, a p3.8xlarge instance costs approximately USD12.24 per hour; running it continuously (8,760 hours a year) works out to roughly USD107,222 annually.

Storage: Amazon S3 or EBS provides flexible storage options. High-performance EBS costs about USD0.10 per GB-month. With an assumed need of 10 TB (10,000 GB), storage would come to around USD12,000 annually.

Network Transfer: Data transfer costs vary, but for significant data inputs and outputs, an estimated monthly charge of USD500 would come to USD6,000 annually.

Additional Services: Utilizing AWS Lambda, API Gateway, or other services can add another USD5,000 in auxiliary costs.

Here is a rough estimate. The total annual cost would be around:

  • Compute: USD107,222
  • Storage: USD12,000
  • Network Transfer: USD6,000
  • Ancillary Services: USD5,000

Total Estimate: Approximately USD130,222 annually.
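For readers who want to adjust the figures, the following Python sketch reproduces the arithmetic behind this estimate and adds a what-if for Reserved or Spot pricing; the 40% compute discount used there is purely an illustrative assumption, not an AWS quotation.

# Annual estimate: one p3.8xlarge on-demand instance running 24/7,
# 10 TB (10,000 GB) of EBS, plus the network and ancillary figures above.

HOURS_PER_YEAR = 24 * 365

compute   = 12.24 * HOURS_PER_YEAR      # ~USD107,222 on-demand
storage   = 10_000 * 0.10 * 12          # ~USD12,000
network   = 500 * 12                    # ~USD6,000
ancillary = 5_000                       # Lambda, API Gateway, monitoring, etc.

total_on_demand = compute + storage + network + ancillary
print(f"On-demand total:            USD{total_on_demand:,.0f}")   # ~USD130,222

# What-if: Reserved or Spot capacity can cut the compute line substantially.
# The 40% discount below is an illustrative assumption only.
ASSUMED_DISCOUNT = 0.40
total_discounted = compute * (1 - ASSUMED_DISCOUNT) + storage + network + ancillary
print(f"With ~40% compute discount: USD{total_discounted:,.0f}")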

Please note that the above covers only the cloud service charges billed by AWS. Labour costs are not included and can vary considerably depending on each customer's requirements.

LAIDFU, a configurable enterprise AI agent built on a no-code approach, allows users to employ different AI service providers in their applications, ranging from OpenAI and Baidu to a self-owned DeepSeek or LLAMA. Users are free to pick the most appropriate LLM to run user-defined use cases across various business processes.

Contact us
