Any data engineers here? Need some suggestions
Hello everyone,
I'm currently working as a data engineer and trying to upskill. My current tech stack is Python, PySpark, pandas, and SQL. I use S3 for storage and Apache Airflow for jobs. I have a few questions about this and would like to hear from someone experienced.
Thanks
What are the questions?
- Will my current tech stack get me a job, or do I need to upskill in other technologies?
- Do I need an AWS certification?
- I previously worked as a Java developer and then switched to data engineering. Should I mention this in interviews going forward, or should I say I only have data engineering experience? I currently have 4 years of experience.
- What type of questions should I expect in interviews?
Oh man! I have less experience than you, but I'll try to answer.
- The tech stack you're working with is pretty much the market standard right now, so you're doing fine. Spark Streaming is a great skill to set you apart, along with Kafka. One thing I'd suggest: pick a cloud, say AWS, and go through the basic services like EMR and EC2. Interviewers ask about these.
- AWS certification is up to you. It helps, but you can still get interviews without it. If you have the time and resources, go for it.
- Mention your Java experience only if you can answer questions about it. Interviewers love to ask Java questions, and mentioning it without knowing it well may backfire.
- Expect questions that test how thoroughly you know Spark: what tasks and executors are, how distributed computing works, map-reduce, and so on. SQL comes up too, as does how you schedule jobs, whether with cron, Airflow, or another tool. And as I mentioned, cloud is asked about a lot.
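If it helps with the map-reduce part, here's a rough sketch of the idea in plain Python (no Spark involved, so treat it as an illustration of the concept rather than how Spark actually implements it). The function names and the sample sentences are made up for this example. Each inner list stands in for a partition: the map step counts words per partition independently, roughly like a task would, and the reduce step merges the partial counts.

```python
from collections import Counter
from functools import reduce


def map_partition(lines):
    """Map step: count words within a single partition (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


def word_count(partitions):
    # In a real cluster each partition would be processed by a separate task;
    # here they are simply processed one after another.
    partials = [map_partition(p) for p in partitions]
    return reduce(merge_counts, partials, Counter())


# Made-up sample data: two "partitions" of text lines.
partitions = [
    ["spark runs tasks on executors", "executors run on worker nodes"],
    ["airflow schedules spark jobs"],
]
totals = word_count(partitions)
print(totals.most_common(3))
```

Being able to explain why the map step can run in parallel (partitions don't depend on each other) and why the reduce step needs the partial results brought together is usually what that kind of interview question is getting at.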