
What is the Cost of Hosting & Running a Self-owned LLAMA in AWS?

For a variety of reasons, some businesses choose to employ less powerful but open-source LLMs (large language models) in their AI (artificial intelligence) applications.

People making such a decision usually want to save on the cost of calling APIs, or they distrust the data-protection measures of AI service providers.

Multiable takes no stand on this approach. In any case, let's take a look at the running cost of a self-owned LLAMA, one of the most popular open-source LLMs at the moment.

Many people make a mistake when they plan to set up a 'usable' LLAMA in the cloud, notably ignoring a range of cloud services necessary for a production run.

Yup, maybe 1 out of 5 IT teams quote only the cost of a UAT environment when applying for budget from management, and then things turn ugly when the system goes live!

In fact, hosting a LLAMA model in Amazon Web Services (AWS) involves several cost components across different AWS services; a short sketch after the list below shows how these per-unit rates add up.

  1. Amazon EC2 (Elastic Compute Cloud):
    • Pricing depends on the instance type and configuration chosen. For hosting LLAMA, a GPU instance such as the p3.2xlarge is recommended for intensive machine learning tasks.
    • p3.2xlarge Instance: Approx. USD3.06 per hour.
    • p3.8xlarge Instance: Approx. USD12.24 per hour.
    • Reserved Instances and Spot Instances can offer significant cost savings.
  2. Amazon S3 (Simple Storage Service):
    • Used for storing datasets and model checkpoints.
    • Standard Storage: USD0.023 per GB per month.
    • Infrequent Access Storage: USD0.0125 per GB per month.
    • Glacier Storage (for archived models): USD0.004 per GB per month.
  3. Amazon EBS (Elastic Block Store):
    • Provides persistent block storage for use with EC2 instances.
    • General Purpose SSD (gp2): USD0.10 per GB per month.
    • Provisioned IOPS SSD (io1): Varies based on provisioned IOPS and storage size.
  4. Amazon VPC (Virtual Private Cloud):
    • Networking costs may be incurred for data transfer between services.
    • Data Transfer Out: First 1 GB per month is free, USD0.09 per GB for up to 10 TB per month.
  5. AWS Lambda:
    • For any serverless functions required in processing.
    • Lambda Functions: USD0.20 per 1 million requests, plus USD0.00001667 per GB-second of compute time.
  6. Amazon CloudWatch:
    • Monitoring and logging services for the infrastructure.
    • Custom Metrics: USD0.30 per metric per month.
    • Logs: USD0.50 per GB ingested, USD0.03 per GB archived.
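To see how these per-unit rates combine, here is a minimal Python sketch that tallies one month of a small production setup. The usage figures (one p3.2xlarge running around the clock, 500 GB in S3, 1 TB of EBS, 200 GB of outbound transfer, and modest Lambda and CloudWatch use) are illustrative assumptions, not a sizing recommendation.

# Rough monthly cost sketch using the example rates quoted above.
# All usage figures are illustrative assumptions, not measurements.

HOURS_PER_MONTH = 730  # common monthly averaging convention

RATES = {
    "ec2_p3_2xlarge_per_hour": 3.06,
    "s3_standard_per_gb_month": 0.023,
    "ebs_gp2_per_gb_month": 0.10,
    "data_transfer_out_per_gb": 0.09,      # after the first free GB
    "lambda_per_million_requests": 0.20,
    "lambda_per_gb_second": 0.00001667,
    "cloudwatch_metric_per_month": 0.30,
    "cloudwatch_logs_per_gb_ingested": 0.50,
}

def monthly_cost(ec2_hours, s3_gb, ebs_gb, transfer_gb,
                 lambda_million_requests, lambda_gb_seconds,
                 custom_metrics, log_gb_ingested):
    """Sum the main monthly cost components listed above."""
    return (
        ec2_hours * RATES["ec2_p3_2xlarge_per_hour"]
        + s3_gb * RATES["s3_standard_per_gb_month"]
        + ebs_gb * RATES["ebs_gp2_per_gb_month"]
        + max(transfer_gb - 1, 0) * RATES["data_transfer_out_per_gb"]
        + lambda_million_requests * RATES["lambda_per_million_requests"]
        + lambda_gb_seconds * RATES["lambda_per_gb_second"]
        + custom_metrics * RATES["cloudwatch_metric_per_month"]
        + log_gb_ingested * RATES["cloudwatch_logs_per_gb_ingested"]
    )

# One GPU instance 24/7, 500 GB S3, 1 TB EBS, 200 GB transfer out,
# 2 million Lambda requests, 100,000 GB-seconds, 10 metrics, 20 GB logs.
print(f"Estimated monthly bill: USD{monthly_cost(HOURS_PER_MONTH, 500, 1000, 200, 2, 100_000, 10, 20):,.2f}")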

Estimating the complete annual cost of hosting and running a self-owned LLAMA in AWS depends on several factors, including computing power, data storage, network transfer costs, and other ancillary services.

Compute: AWS provides various instance types suitable for large language models, such as GPU-based EC2 instances. For example, a p3.8xlarge instance costs approximately USD12.24 per hour; running it continuously (8,760 hours a year) works out to roughly USD107,222 annually.

Storage: Amazon S3 or EBS provides flexible storage options. High-performance EBS costs about USD0.10 per GB-month. With an assumed need of 10 TB (10,000 GB), storage would come to around USD12,000 annually.

Network Transfer: Data transfer costs vary, but for significant data inputs and outputs, an estimated monthly charge of USD500 would come to USD6,000 annually.

Additional Services: Utilizing AWS Lambda, API Gateway, or other services can add another USD5,000 in auxiliary costs.

Here is a rough estimate. The total annual cost would be around:

  • Compute: USD107,222
  • Storage: USD12,000
  • Network Transfer: USD6,000
  • Ancillary Services: USD5,000

Total Estimate: Approximately USD130,222 annually.
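For readers who want to adjust the figures, the following Python sketch reproduces the arithmetic behind this estimate and adds a what-if for Reserved or Spot pricing; the 40% compute discount used there is purely an illustrative assumption, not an AWS quotation.

# Annual estimate: one p3.8xlarge on-demand instance running 24/7,
# 10 TB (10,000 GB) of EBS, plus the network and ancillary figures above.

HOURS_PER_YEAR = 24 * 365

compute   = 12.24 * HOURS_PER_YEAR      # ~USD107,222 on-demand
storage   = 10_000 * 0.10 * 12          # ~USD12,000
network   = 500 * 12                    # ~USD6,000
ancillary = 5_000                       # Lambda, API Gateway, monitoring, etc.

total_on_demand = compute + storage + network + ancillary
print(f"On-demand total:            USD{total_on_demand:,.0f}")   # ~USD130,222

# What-if: Reserved or Spot capacity can cut the compute line substantially.
# The 40% discount below is an illustrative assumption only.
ASSUMED_DISCOUNT = 0.40
total_discounted = compute * (1 - ASSUMED_DISCOUNT) + storage + network + ancillary
print(f"With ~40% compute discount: USD{total_discounted:,.0f}")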

Please note that the above covers only the cloud service charges billed by AWS. Labour costs are not included and can vary considerably depending on each customer's requirements.

LAIDFU, a configurable enterprise AI agent built on a no-code approach, allows users to employ different AI service providers in their applications, ranging from OpenAI and Baidu to a self-owned DeepSeek or LLAMA. Users are free to pick the most appropriate LLM to run user-defined use cases across various business processes.

Contact us
