
Software Engineers Community

by Kendall Lee

10 months ago

Big Data: How does data parsing happen in your company?

Hi community,

Context: In my current company we have a data pipeline that, in short, works like this: raw events from Kafka are dumped into S3. A batch job (Airflow) picks up the raw JSON files in S3 and applies our parser logic (a service written in Python where we explicitly define which attributes we want from the raw JSON, and those attributes are parsed accordingly). The parsed data is then converted to CSV/Parquet, written to another folder in S3, and later loaded into tables that are used for analytics, etc.

Problem: Today, for every new event we generate, we have to write parser logic from scratch if the event structure is different. For small changes we can update the attributes we want to parse in the code itself, but we then have to deploy the change, which takes time.

Is there a smarter way of doing this? For example, a UI where we select the attributes from the JSON (including nested attributes), which are then parsed and dumped in S3, with loading happening later. If we wanted to update the parser, we could do so from the UI itself rather than updating code, deploying, etc.

Are there any open-source alternatives here? Or any good engineering blogs that have covered this or a similar scenario?
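To make the idea concrete, here is a minimal sketch of what a config-driven parser could look like: the list of attributes lives in a small JSON spec instead of in code, so changing it would not need a deploy. Everything here is an assumption for illustration only: the spec layout, the dotted-path convention, and the names `SPEC_JSON`, `get_path`, and `parse_event` are hypothetical and not taken from our actual service or from any library.

```python
import json
from typing import Any

# Hypothetical parser spec: output column -> dotted path into the raw event.
# Assumption: in practice this could be stored outside the codebase
# (for example as a JSON file), so editing it does not require a deploy.
SPEC_JSON = """
{
  "event_name": "signup",
  "fields": {
    "user_id": "user.id",
    "country": "user.address.country",
    "plan": "plan"
  }
}
"""


def get_path(event: dict, path: str, default: Any = None) -> Any:
    """Walk a dotted path through nested dicts; return default if a key is absent."""
    current: Any = event
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def parse_event(raw: str, spec: dict) -> dict:
    """Extract only the attributes named in the spec from one raw JSON event."""
    event = json.loads(raw)
    return {column: get_path(event, path) for column, path in spec["fields"].items()}


spec = json.loads(SPEC_JSON)
raw_event = '{"user": {"id": 42, "address": {"country": "IN"}}, "plan": "pro", "extra": 1}'
row = parse_event(raw_event, spec)
print(row)
```

With something like this, a UI would only need to edit the spec, and since the spec is plain JSON, a parser in any language could read it. The sketch does not handle arrays or type casting.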

Blair Nadeen

Latent View Analytics Limited

10 months ago

Try asking in this subreddit: https://www.reddit.com/r/dataengineering/s/mcFo9ng1t0

Kalan Vernon

Amazon

10 months ago

Noted. Thanks buddy.

Coy Nadeen

Amazon

10 months ago

In simple terms, I want an abstraction over the raw data I want to parse, and I want to make things language-agnostic.
