How to scale up in SDE role
I have 4 years of experience and I feel I am pretty good at what I do, but how do I scale up from here? How do I learn new things? Please suggest and help me out.
Hi all, I'm currently working at a startup and we have started to face scaling issues. For example, our database tables have started growing rapidly in row count. How do you handle this kind of data? How should I think about improving and writing queries so they perform optimally on billions of records? Can people with experience share any resources or personal experiences from going from 1 to 100? Even the smallest suggestions would help. Grateful and thanks to all in advance.🙌
Check all indices (indexes).
Find the queries in your monitoring tool that consume the most CPU, and run EXPLAIN/ANALYZE on those.
Move read queries to a read replica and write queries to the master.
Consider caching the results of popular queries.
Create appropriate shards if needed.
Run batch jobs to archive stale/unused data from the DB to S3.
Try moving tables to NoSQL (start off with tables that don't need any joins; skip tables where you need transactional properties).
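To make the caching step above concrete, here is a minimal sketch using Python's built-in sqlite3 as a stand-in for MySQL (the table and column names are made up for illustration). In production you would more likely put Redis or Memcached in front of the database rather than an in-process `lru_cache`, and you would need an invalidation strategy when the underlying rows change:

```python
import sqlite3
from functools import lru_cache

# sqlite3 stands in for the production MySQL database; the schema
# below is hypothetical, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, client_id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders (client_id, total) VALUES (?, ?)",
                 [(i % 100, i * 1.5) for i in range(10_000)])
conn.commit()

@lru_cache(maxsize=1024)
def client_order_total(client_id: int) -> float:
    # Repeated calls for a popular client are served from the cache and
    # never touch the database again until the cache entry is evicted.
    row = conn.execute("SELECT SUM(total) FROM orders WHERE client_id = ?",
                       (client_id,)).fetchone()
    return row[0] or 0.0

client_order_total(7)                        # first call hits the DB
client_order_total(7)                        # second call hits the cache
print(client_order_total.cache_info().hits)  # 1
```

The same shape applies with an external cache: key on the normalized query plus parameters, and expire or invalidate on writes.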
Thanks for the detailed answers. We have applied indexes as well, but suppose we apply a WHERE clause on an indexed attribute and that attribute is given an array of 50k IDs; then the indexes are not useful either. Can you suggest a better way to handle this?
An alternative approach is to reconsider your data model. Depending on your use case, you might create a separate document for each ID or group IDs in a way that aligns with your query patterns. This can lead to more efficient queries and better scalability.
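One concrete way to restructure the 50k-ID lookup, sketched below with Python's built-in sqlite3 standing in for MySQL (table names are invented): instead of a single enormous `IN (...)` literal, load the IDs into a temporary table and JOIN against it, which lets the planner walk the primary-key index per ID. In MySQL the same idea works with a `TEMPORARY` table, or alternatively by batching the ID list into smaller chunks:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO items (id, payload) VALUES (?, ?)",
                 [(i, f"row-{i}") for i in range(100_000)])

wanted_ids = range(0, 100_000, 2)  # stand-in for the 50k IDs from the question

# Instead of one giant "WHERE id IN (...50,000 literals...)", load the IDs
# into a temporary table and JOIN; each ID becomes an index lookup rather
# than part of a huge literal list the server must parse and match.
conn.execute("CREATE TEMP TABLE wanted (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO wanted (id) VALUES (?)", ((i,) for i in wanted_ids))

rows = conn.execute(
    "SELECT items.id, items.payload FROM items JOIN wanted ON items.id = wanted.id"
).fetchall()
print(len(rows))  # 50000
```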
Which databases? What's the current ingestion rate? What will it be in a year? What's the current resource allocation? You expect ppl to answer without any of these. I can suggest a fancy system design that can support millions of reads and writes, but that doesn't mean it will solve your problem. Each use case has its own solutions.
Ppl are jumping into the thread without asking about the constraints. If I were an interviewer, I would have rejected them solely for jumping to conclusions without asking for the finer details.
Ok, so the database is MySQL. The ingestion rate is not known to me. It is a read-heavy system, as we have a NoSQL DB where we store computed data. So if a specific table is updated, and suppose that table impacts 5 to 6 tables in the NoSQL store, then a lot of reads happen to compute the data. Also, the tables in MySQL have started growing as more clients come on board, so on some tables even a basic WHERE query is starting to take time. How can I grow this system?
Get slow query logging enabled. Once you have that, use EXPLAIN ANALYZE to see if it's doing a sequential/table scan. If that's the case, create indexes on the columns that are queried most often and on the columns in your WHERE conditions, then re-run EXPLAIN to validate that it's doing an index scan now. This is the starting point. You can further denormalise the tables if that's possible.
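That before/after EXPLAIN workflow can be sketched end to end with Python's built-in sqlite3 (as a stand-in for MySQL; the `events` schema is made up, and SQLite's `EXPLAIN QUERY PLAN` plays the role of MySQL's `EXPLAIN ANALYZE`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, kind TEXT)")
conn.executemany("INSERT INTO events (user_id, kind) VALUES (?, ?)",
                 [(i % 500, "click") for i in range(5_000)])

def plan(sql: str) -> str:
    # The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
    return " ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM events WHERE user_id = 42"
before = plan(query)   # something like "SCAN events": a full table scan

conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
after = plan(query)    # now "SEARCH events USING INDEX idx_events_user (user_id=?)"

print(before)
print(after)
```

The exact plan text varies by SQLite version, but the shift from SCAN (every row examined) to SEARCH (index lookup) is the signal to validate after adding an index.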
Shard/read replica, index, partition.
Note: indexing will slow down writes, and some types of indexes are faster than others.
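The write-side cost of indexes mentioned above can be seen with a rough timing sketch (Python's built-in sqlite3 as a stand-in for MySQL; absolute numbers depend entirely on the machine, but the indexed insert is typically noticeably slower because every index must be updated per row):

```python
import sqlite3
import time

def bulk_insert(with_indexes: bool) -> float:
    """Time a 50k-row bulk insert, with or without secondary indexes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
    if with_indexes:
        # Each index below must be maintained on every INSERT.
        for col in ("a", "b", "c"):
            conn.execute(f"CREATE INDEX idx_{col} ON t ({col})")
    rows = [(i, i * 2, i * 3) for i in range(50_000)]
    start = time.perf_counter()
    with conn:
        conn.executemany("INSERT INTO t (a, b, c) VALUES (?, ?, ?)", rows)
    return time.perf_counter() - start

print(f"no indexes:    {bulk_insert(False):.3f}s")
print(f"three indexes: {bulk_insert(True):.3f}s")
```

This is why a common bulk-load trick is to drop (or defer) secondary indexes, load the data, then rebuild them once.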
I worked on a very large-scale data warehouse at a financial institution.
I am sure you will be able to find it on Google. Just deep dive before asking here. It will help you in the long run.
Hi all,
Wanted to know the size of the database in your respective orgs and your choice of DB. Also, please mention any challenges faced while scaling it.
Mine: main DB is around 3 terabytes (no sharding, so on a single node with replication). Choice ...
Just how do you scale? Do you use AWS auto-scaler for that?
Seems like you are early in your DevOps/BE journey. The way you do it is: 1. Build performant APIs. <300ms is good a...