Frequently Asked Questions
Technical FAQs
What is the SLA for Arcee Orchestra when consumed via cloud SaaS?
Uptime Commitment: 99.95% uptime, which allows for approximately 4.4 hours of downtime per year. The Orchestra platform's architecture ensures that updates or upgrades to individual services within the cluster do not disrupt other functionality.
Response Times: 24/7 on-call developer support, with a 60-minute response time for critical issues. Resolution times depend on complexity but are expected to be a matter of hours, not days.
Credits for SLA Breaches: Credits will be applied to account for workflow executions lost during periods of SLA breaches.
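As a rough illustration of how an uptime percentage translates into a downtime budget, the arithmetic can be sketched as follows. The function name `downtime_budget_hours` and the 30-day window are assumptions made for this example; they are not part of the SLA wording.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_budget_hours(uptime_percent: float,
                          period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the downtime allowed in a period for a given uptime percentage."""
    return period_hours * (1 - uptime_percent / 100)


# Yearly budget for the 99.95% commitment stated above.
yearly_hours = downtime_budget_hours(99.95)

# The same commitment over an assumed 30-day window, in minutes.
monthly_minutes = downtime_budget_hours(99.95, period_hours=30 * 24) * 60

print(f"Yearly budget: {yearly_hours:.2f} hours")
print(f"30-day budget: {monthly_minutes:.1f} minutes")
```

At 99.95%, this works out to a little under four and a half hours per year.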
Runtime limitations for Arcee Orchestra
Workflows: Workflows execute along the graph node by node, so there is no limit on the number of nodes and no timeout at the workflow level.
Integrations: The provided integrations access external systems through APIs. Arcee's platform does not restrict the number of requests to an external source; however, some companies build rate limits into their own APIs, and those limits must be followed.
Code Node: Individual code nodes have a 30-second runtime timeout. However, a long process can be split across several code nodes, because there is no code execution limit at the workflow level. Not all PyPI packages are enabled by default, but we can make most libraries that customers need available.
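One way to plan around the 30-second code node timeout is to estimate how long each step takes and group consecutive steps so that every group fits in a single node. The sketch below is only a planning aid: `plan_code_nodes`, the 5-second safety margin, and the step names and timings are all hypothetical, and nothing here calls an Orchestra API.

```python
CODE_NODE_TIMEOUT_S = 30.0  # per-node limit stated in this FAQ
SAFETY_MARGIN_S = 5.0       # assumption: headroom for startup and I/O


def plan_code_nodes(step_estimates,
                    budget_s=CODE_NODE_TIMEOUT_S - SAFETY_MARGIN_S):
    """Group ordered (name, seconds) steps into batches, one batch per code node.

    Steps stay in their original order; a new batch starts whenever adding
    the next step would exceed the per-node budget.
    """
    batches, current, used = [], [], 0.0
    for name, seconds in step_estimates:
        if seconds > budget_s:
            raise ValueError(
                f"step {name!r} alone exceeds the {budget_s}s budget; split it further"
            )
        if current and used + seconds > budget_s:
            batches.append(current)
            current, used = [], 0.0
        current.append(name)
        used += seconds
    if current:
        batches.append(current)
    return batches


# Hypothetical steps with estimated runtimes in seconds.
steps = [("load", 8), ("clean", 12), ("score", 20), ("summarize", 4), ("export", 3)]
plan = plan_code_nodes(steps)
for index, batch in enumerate(plan, start=1):
    print(f"code node {index}: {', '.join(batch)}")
```

Each resulting batch would become its own code node in the workflow, with the output of one node passed to the next.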
What are all the available built-in integrations?
You can find a list of the available integrations here.
Is there any portability of the system?
For each workflow you create, the configuration can be downloaded. This means you'll have the prompt templates for each model, what integrations you're connecting with, and any code written in the code nodes. This allows for portability of information and configurations.
With the Enterprise "Shared" tier, could performance suffer when multiple tenants use it at the same time?
Our infrastructure supports multiple simultaneous workflows across numerous tenants. The application is built on Kubernetes, allowing for auto-scaling to accommodate high levels of traffic.
Is there any guarantee of input/output token speed?
For our models, the token bandwidth is 380 tokens per second (TPS) per 10 users. This means the slowest feasible speed per user before scaling occurs is approximately 38 TPS (380 divided by 10). In practice, the average TPS for tiny, small, and medium models is 100, 70, and 50, respectively, so we can confidently guarantee a minimum of 38 TPS for these tiers. For large models, the minimum is likely around 30 TPS.
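The per-user floor and its practical effect on response time can be worked through with a short calculation. This is a sketch that uses only the figures quoted above; the helper names `floor_tps` and `generation_seconds` and the 1,000-token response length are assumptions chosen for illustration.

```python
POOL_TPS = 380       # shared token bandwidth, from the FAQ
USERS_PER_POOL = 10  # users sharing that bandwidth before scaling, from the FAQ
AVERAGE_TPS = {"tiny": 100, "small": 70, "medium": 50}  # averages from the FAQ


def floor_tps(pool_tps: float = POOL_TPS, users: int = USERS_PER_POOL) -> float:
    """Worst-case per-user throughput when the pool is fully shared."""
    return pool_tps / users


def generation_seconds(num_tokens: int, tps: float) -> float:
    """Time to stream num_tokens at a steady tokens-per-second rate."""
    return num_tokens / tps


worst_case = floor_tps()
response_tokens = 1000  # assumed response length for illustration

print(f"Per-user floor: {worst_case:.0f} TPS")
print(f"{response_tokens} tokens at the floor: "
      f"{generation_seconds(response_tokens, worst_case):.1f} s")
for tier, tps in AVERAGE_TPS.items():
    print(f"{response_tokens} tokens on a {tier} model (average): "
          f"{generation_seconds(response_tokens, tps):.1f} s")
```

Comparing the floor with the tier averages shows how much slower a response could be under full contention than in the typical case.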
What if we want to upgrade to "dedicated"? Is there an upgrade path?
Transitioning to a dedicated environment is straightforward from an infrastructure perspective. When users upgrade, we clone their data and set up a new environment, a process that can be completed within five days.
What onboarding, training, and hands-on assistance does the CS team provide once a contract is signed?
We start with a 1-hour onboarding session in which our CS team meets with your team to get Orchestra set up and teach you the basics of the system. We then schedule 30-minute syncs based on your team's progress on its workflows. You can request up to 5 hours of support per week.