Software Engineers Community

by Jarusv7

8 months ago

Big Data: How does data parsing happen in your company?

Hi community,

Context: At my current company we have a data pipeline that, in short, works like this: raw events from Kafka are dumped in S3. A batch job (Airflow) picks up the raw JSON files in S3 and applies our parser logic (a service written in Python where we explicitly define which attributes we want from the raw JSON, and those attributes are parsed accordingly). The parsed data is then converted to CSV/Parquet, written to another folder in S3, and later loaded into tables that are used for analytics, etc.

Problem: Today, for every new event we generate, we have to write parser logic from scratch if the event structure is different. For small changes we can update the attributes we want to parse in the code itself, but after that we have to deploy the changes, which takes time.

Is there a smarter way of doing this? For example, a UI where we select the attributes from the JSON (including nested attributes), which are then parsed and dumped in S3, with loading happening later. If we want to update a parser, we could do so from the UI itself rather than going into the code, updating things, deploying, etc.

Are there any open-source alternatives here? Or any good engineering blogs that have covered this or a similar scenario?
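To make the idea concrete, here is a minimal sketch of what a config-driven parser could look like: the attributes to extract live in a plain mapping instead of hand-written parser code. The names `ORDER_SPEC`, `get_path`, and `parse_event`, and the sample event fields, are hypothetical and only for illustration; this is not how the existing service works.

```python
from typing import Any, Dict

# Hypothetical spec: output column name -> dotted path into the raw event.
ORDER_SPEC: Dict[str, str] = {
    "order_id": "id",
    "user_id": "user.id",
    "city": "user.address.city",
}


def get_path(event: Any, path: str, default: Any = None) -> Any:
    """Walk a dotted path through nested dicts; return default if any key is missing."""
    current = event
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def parse_event(event: Dict[str, Any], spec: Dict[str, str]) -> Dict[str, Any]:
    """Build one flat output row by looking up every path listed in the spec."""
    return {column: get_path(event, path) for column, path in spec.items()}


raw = {"id": 7, "user": {"id": 42, "address": {"city": "Pune"}}, "extra": "ignored"}
row = parse_event(raw, ORDER_SPEC)
```

With something like this, adding or dropping an attribute means editing the spec, not the parsing function, and a UI would only need to read and write that spec.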

Dataguy69

Latent View Analytics Limited

8 months ago

Try asking in the r/dataengineering subreddit: https://www.reddit.com/r/dataengineering/s/mcFo9ng1t0

Jarusv7

Amazon

8 months ago

Noted. Thanks buddy.

Jarusv7

Amazon

8 months ago

In simple words, I want an abstraction over the raw data I want to parse, one that makes things language-agnostic.
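One way to picture that abstraction, as a rough sketch only: keep the parsing rules as a JSON document keyed by event type, so any language can read them and they can change without a code deploy. The layout of `SPECS_JSON`, the `event_type` field, and the `path`/`type` keys are all assumptions made up for this example. Python is used here just because the current service is in Python.

```python
import json

# Hypothetical rules document. In practice it could live in S3 or a database
# and be edited from a UI; it is inlined here to keep the sketch self-contained.
SPECS_JSON = """
{
  "order_created": {
    "order_id": {"path": "id", "type": "int"},
    "amount": {"path": "payment.amount", "type": "float"}
  },
  "user_signup": {
    "email": {"path": "profile.email", "type": "str"}
  }
}
"""

# Assumed set of supported type names mapped to Python casts.
CASTS = {"int": int, "float": float, "str": str}


def extract(event, path):
    """Follow a dotted path through nested dicts; None if any key is missing."""
    for key in path.split("."):
        if not isinstance(event, dict) or key not in event:
            return None
        event = event[key]
    return event


def parse(event, specs):
    """Pick the rules for this event's type and build one typed, flat row."""
    rules = specs[event["event_type"]]
    row = {}
    for column, rule in rules.items():
        value = extract(event, rule["path"])
        row[column] = None if value is None else CASTS[rule["type"]](value)
    return row


specs = json.loads(SPECS_JSON)
order_row = parse(
    {"event_type": "order_created", "id": "12", "payment": {"amount": "99.5"}},
    specs,
)
```

Supporting a new event type would then mean adding an entry to the rules document; the batch job would re-read the document on each run.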
