This Research Paper changed my life forever.
It was one of the papers discussed in my interview at Goldman. I came to know about this research paper a few years back after consulting a friend doing an ML PhD at the University of Maryland, College Park. The explanation of the paper:

1. Initialize the neural network with small random values, typically in (-0.1, 0.1), to avoid symmetry issues.
2. Now get ready to do forward propagation: pass the training data through the multilayer perceptron and compute the output. For each neuron in the MLP, calculate the weighted sum of its inputs and apply the activation function (my favourite is tanh for LSTM applications).
3. Now compute the loss, using a loss function like mean squared error, between the computed output and the actual value.
4. Now get ready to do backpropagation, where you calculate the gradient of the loss function with respect to each weight by propagating the error backward through the network.
5. That is, compute partial derivatives of the loss with respect to each weight, starting from the output layer and moving back to the input layer.
6. Here is the fun part: update the weights using the gradients obtained from the backward pass. People usually use the Adam optimizer here, which accelerates stochastic gradient descent. Fun trivia: Adam stands for "Adaptive Moment Estimation".
7. Now repeat the forward and backward propagation process for numerous iterations until the performance of the model stabilizes.
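A minimal NumPy sketch of steps 1-7 for a one-hidden-layer MLP with tanh and MSE. The layer sizes, toy data, and plain gradient-descent update (rather than Adam) are my own illustrative choices, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: small random weights in (-0.1, 0.1) to break symmetry.
W1 = rng.uniform(-0.1, 0.1, (3, 8))        # input dim 3 -> hidden dim 8 (illustrative)
b1 = np.zeros(8)
W2 = rng.uniform(-0.1, 0.1, (8, 1))        # hidden dim 8 -> output dim 1
b2 = np.zeros(1)

X = rng.normal(size=(32, 3))               # toy training batch
y = np.sin(X.sum(axis=1, keepdims=True))   # toy regression target

lr = 0.05
for epoch in range(500):                   # step 7: repeat until performance stabilizes
    # Steps 2-3: forward pass, then MSE loss.
    h = np.tanh(X @ W1 + b1)               # hidden activations
    y_hat = h @ W2 + b2                    # linear output layer
    loss = np.mean((y_hat - y) ** 2)

    # Steps 4-5: backward pass, from the output layer back to the input layer.
    d_yhat = 2 * (y_hat - y) / len(X)      # dL/d(y_hat)
    dW2 = h.T @ d_yhat
    db2 = d_yhat.sum(axis=0)
    d_h = (d_yhat @ W2.T) * (1 - h ** 2)   # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ d_h
    db1 = d_h.sum(axis=0)

    # Step 6: weight update (plain gradient descent here; Adam would
    # additionally track running first/second moment estimates of the gradients).
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

    if epoch % 100 == 0:
        print(epoch, loss)
```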
https://www.iro.umontreal.ca/~vincentp/ift3395/lectures/backprop_old.pdf
Data Scientists on · by Gooner7, Goldman Sachs
OpenAI Cofounder: "Learn these 30 research papers and you will know 90% of what matters today."
Please like and bookmark this if you find it useful. I have access to top AI researchers and ML PhDs working on the cutting edge in India and the US, and will post some good content if this crosses 100 likes. Ilya Sutskever, OpenAI cofounder, gave John Carmack this reading list of approximately 30 research papers and said, ‘If you really learn all of these, you’ll know 90% of what matters today.’ FYI: John Carmack is also widely considered one of the greatest programmers of all time.
https://arc.net/folder/D0472A20-9C20-4D3F-B145-D2865C0A9FEE
Data Scientists on · by Babel, Yubi
The research paper that changed the world...
Anything happening in AI today can be traced back to this one brief moment in history... Share your favourite papers. GPT ftw :)
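The linked paper is "Attention Is All You Need" (Vaswani et al., 2017). A minimal NumPy sketch of its core operation, scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V; the shapes here are illustrative choices of mine, not the paper's:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Eq. 1 of the paper)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # pairwise query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                            # attention-weighted sum of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))    # 4 query positions, d_k = 8 (illustrative)
K = rng.normal(size=(6, 8))    # 6 key positions
V = rng.normal(size=(6, 16))   # matching values, d_v = 16
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 16)
```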
https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
Data Scientists on · by Gooner7, Goldman Sachs
This paper kicked off the AI revolution in selecting stocks for investment...
I was talking to my friend from the University of Maryland, College Park, and he told me to read this paper. I will continue to share more such papers if everyone who reads this post decides to upvote it. Next paper at 50 upvotes.
https://proceedings.neurips.cc/paper_files/paper/1996/file/1d72310edc006dadf2190caad5802983-Paper.pdf
Data Scientists on · by Gooner7, Goldman Sachs
Paper with Code: You can now run LLMs without Matrix Multiplications
Saw this paper: https://arxiv.org/pdf/2406.02528 In essence, MatMul operations can be completely eliminated from LLMs while maintaining strong performance at billion-parameter scales. And by utilising an optimised kernel during inference, the model's memory consumption can be reduced by more than 10x compared to unoptimised models. Source: https://x.com/rohanpaul_ai/status/1799122826114330866
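The central trick, as I read the paper, is constraining weights to ternary values {-1, 0, +1}, so a matrix-vector product collapses into additions and subtractions with no multiplies. A toy NumPy sketch of that idea; the thresholding scheme and sizes here are simplified illustrations, not the paper's exact quantisation method:

```python
import numpy as np

def ternarize(W):
    """Crude ternary quantisation: map weights to {-1, 0, +1}.
    (Illustrative thresholding, not the paper's exact scheme.)"""
    t = 0.5 * np.abs(W).mean()
    return np.where(W > t, 1, np.where(W < -t, -1, 0)).astype(np.int8)

def ternary_matvec(W_tern, x):
    """Matrix-vector product with ternary weights, written with only
    additions and subtractions: equivalent to W_tern @ x, no MatMul."""
    out = np.zeros(W_tern.shape[0])
    for i in range(W_tern.shape[0]):
        row = W_tern[i]
        out[i] = x[row == 1].sum() - x[row == -1].sum()
    return out

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 16))   # a dense layer's weights (illustrative size)
x = rng.normal(size=16)
W_t = ternarize(W)
print(np.allclose(ternary_matvec(W_t, x), W_t @ x))  # True: same result, no multiplies
```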
Implementation for MatMul-free LM:
https://github.com/ridgerchu/matmulfreellm
Data Scientists on · by Gooner7, Goldman Sachs
ImageNet Classification with Deep Convolutional Neural Networks
The 2012 breakthrough by Krizhevsky (of AlexNet fame), Sutskever (co-founder of OpenAI), and Hinton (the "Godfather of AI") with AlexNet revolutionized AI. By using deep convolutional neural networks and leveraging GPUs for training, they achieved a dramatic jump in accuracy on the ImageNet dataset. This not only validated deep learning's potential but also introduced key innovations like ReLU activations, dropout, and data augmentation.
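Two of those innovations are one-liners on arrays. A minimal NumPy sketch of ReLU and training-time dropout; I use the modern "inverted" dropout formulation here, whereas AlexNet's original version instead scaled activations at test time:

```python
import numpy as np

def relu(x):
    """ReLU: zero out negatives; cheaper and less saturation-prone than tanh/sigmoid."""
    return np.maximum(0.0, x)

def dropout(x, p_drop=0.5, rng=np.random.default_rng(0)):
    """Inverted dropout (training time): randomly zero units, rescale survivors.
    AlexNet used p_drop = 0.5 in its fully connected layers."""
    mask = rng.random(x.shape) >= p_drop
    return x * mask / (1.0 - p_drop)

h = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(h))             # [0.  0.  0.  1.5 3. ]
print(dropout(relu(h)))    # roughly half the units zeroed, survivors scaled up 2x
```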
Alex Krizhevsky, Ilya Sutskever and Geoffrey E. Hinton
https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf