CrazyFifth

How do you solve for peak concurrency, say more than 10,000 concurrent users at one point?

Just how do you scale? Do you use the AWS auto-scaler for that?

5mo ago · 14K views
DualWannabe

Seems like you are early in your DevOps/BE journey. The way you do it is:

  1. Build performant APIs. <300ms is good and <100ms is great.
  2. Cache all db calls wherever you can.
  3. Cache all recurring calls to internal objects.
  4. Profile your code and reduce time complexity.
  5. Spread API calls across several workers by multithreading.
  6. Use a load balancer and dynamically allocate more or fewer servers behind it based on CPU utilization or requests per second.
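Point 2 can be sketched as a small TTL cache in front of a db call. This is a toy in-process sketch, assuming a `get_user` query and a 60-second TTL; in production you would more likely put Redis or memcached in front of the db.

```python
import time
from functools import wraps

def ttl_cache(ttl_seconds=60):
    """Memoize results for ttl_seconds (hypothetical helper for illustration)."""
    def decorator(fn):
        store = {}  # args -> (expiry_timestamp, value)
        @wraps(fn)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit and hit[0] > now:
                return hit[1]          # served from cache, no db round trip
            value = fn(*args)
            store[args] = (now + ttl_seconds, value)
            return value
        return wrapper
    return decorator

call_count = 0

@ttl_cache(ttl_seconds=60)
def get_user(user_id):
    global call_count
    call_count += 1                    # stands in for a slow db query
    return {"id": user_id, "name": f"user-{user_id}"}

get_user(1); get_user(1); get_user(2)
# call_count ends at 2: the repeated lookup for user 1 hit the cache
```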
debugging

How about Lambda?

ClearOyster

I solved it using Lambda and API Gateway
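A minimal sketch of that setup, assuming API Gateway's Lambda proxy integration (the `user_id` query parameter is made up for illustration). Lambda handles concurrency by running one execution per in-flight request, so scaling is largely automatic.

```python
import json

def handler(event, context):
    """AWS Lambda handler behind API Gateway (proxy integration).
    API Gateway passes the HTTP request as the `event` dict."""
    user_id = (event.get("queryStringParameters") or {}).get("user_id", "anonymous")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"hello": user_id}),
    }

# Local smoke test with a fake API Gateway event
resp = handler({"queryStringParameters": {"user_id": "42"}}, None)
```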

SpryJunker

From what I remember, Hotstar used in-house tooling that scales infra based on request rate and concurrent users per unit time, rather than default metrics like CPU or network usage.

That makes things a little simpler, because you increase the number of servers in proportion to demand.

An open-source alternative is also available: KEDA, for k8s autoscaling
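The "proportional to demand" idea boils down to a one-line capacity calculation. A sketch, assuming a per-replica capacity of 500 req/s established by load testing (the numbers are made up, not Hotstar's):

```python
import math

def desired_replicas(current_rps, rps_per_replica=500,
                     min_replicas=2, max_replicas=100):
    """Request-rate-driven scaling in the spirit of KEDA's rate triggers:
    replica count grows in proportion to demand rather than CPU."""
    needed = math.ceil(current_rps / rps_per_replica)
    return max(min_replicas, min(max_replicas, needed))

desired_replicas(12_000)   # 12k req/s at 500 req/s each -> 24 replicas
```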

Deadpool93

@SpryJunker just being curious.. Don't request rate and concurrent users per unit time ultimately boil down to CPU usage and memory metrics? How is scaling based on request rate justified here?

HandyYahoo

https://www.youtube.com/watch?v=9b7HNzBB3OQ

This is the best video on an Indian company handling insane concurrency, while 75% of all bandwidth available in India is being consumed at the same time.

inr

I was recommended this just today.

Rakz

There is a lot of information missing here, but based on what you have given, let's take a crack at this.

  1. Identify where the problem lies. If you're seeing a resource crunch, you need to benchmark the service properly before it goes live in production. Give performance the same due diligence as functional testing.

  2. From a DevOps point of view, if you're seeing a gradual increase in load you can employ reactive scaling: autoscaling groups (if deployed on EC2) or the Horizontal Pod Autoscaler (if using k8s).

  3. For sudden spikes, reactive scaling won't work: by the time new capacity spins up, the service will already be erroring out. For that scenario you need proactive scaling. Study the traffic pattern and build automation around it to predictively scale your infra.

Please note that this answer assumes there are no design flaws in your application and that it has been optimally designed to handle that many requests coming in.
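The proactive-scaling idea in point 3 can be sketched as: learn the traffic pattern from history, then provision ahead of it. Everything here is an assumption for illustration, including the hourly granularity, the 500 req/s per-replica capacity, and the 1.5x headroom factor.

```python
import math
from collections import defaultdict
from statistics import mean

def build_hourly_profile(history):
    """history: list of (hour_of_day, observed_rps) samples from past traffic.
    Returns the mean rps seen at each hour -- the 'pattern' to scale on."""
    by_hour = defaultdict(list)
    for hour, rps in history:
        by_hour[hour].append(rps)
    return {hour: mean(v) for hour, v in by_hour.items()}

def prewarm_replicas(profile, hour, rps_per_replica=500, headroom=1.5):
    """Scale out *before* the spike: provision for the historical rate at
    this hour plus headroom, instead of reacting after errors start."""
    expected = profile.get(hour, 0)
    return max(1, math.ceil(expected * headroom / rps_per_replica))

history = [(20, 9_000), (20, 11_000), (21, 2_000)]  # evening spike at 8pm
profile = build_hourly_profile(history)
prewarm_replicas(profile, 20)   # provision for ~10k rps plus headroom
```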

tatiana_xoxo

@Rakz Very solid. The problem is that most engineers have never faced problems at scale, so there are very few engineers with that kind of experience going around.

I was about to answer the same. Seems like a consumer internet startup problem. SaaS is usually more chill.

Rakz

I agree, hence I mentioned that performance has to be given the same due diligence as functional testing. Lots of new startups tend to ignore it, but if you're consumer facing, customers won't come back if you're giving them a laggy experience.

PlayfulFob6
Zeta · 5mo

I would analyze why you are seeing a spike: is there some sale going on, did your marketing team send notifications, or is it just a transient spike?

How you can scale

  1. You can scale based on concurrent request count, CPU, or disk usage
  2. If this spike is part of your regular traffic pattern, set up scheduled autoscaling for that particular time frame
  3. If it is part of some sale or marketing campaign, you need to proactively scale the fleet.

Thanks
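The concurrent-request-count approach in point 1 above can be sketched as a simple threshold policy. The per-replica limit of 200 in-flight requests and the 80%/30% thresholds are assumed figures you would get from benchmarking, not universal defaults.

```python
def reactive_scale(current_replicas, concurrent_requests, per_replica_limit=200,
                   scale_out_at=0.8, scale_in_at=0.3, min_replicas=2):
    """Reactive policy on concurrent request count: add a replica when
    utilization crosses 80%, remove one when it drops below 30%."""
    utilization = concurrent_requests / (current_replicas * per_replica_limit)
    if utilization > scale_out_at:
        return current_replicas + 1
    if utilization < scale_in_at and current_replicas > min_replicas:
        return current_replicas - 1
    return current_replicas

reactive_scale(10, 1_800)   # 90% utilized -> scale out to 11
reactive_scale(10, 400)     # 20% utilized -> scale in to 9
```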

tatiana_xoxo

Point 1 is probably the best and most generalizable.
