Access to parallel corpora is a limiting factor in general. A good way to train a language translator is to use an open-source dataset (there are several at http://opus.nlpl.eu/) to train a base model, and then fine-tune it with a smaller dataset specific to your domain.
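As a rough sketch of that recipe (this is my assumption of how you'd do it today with the HuggingFace `transformers`/`datasets` libraries and a Marian model already pretrained on OPUS data; the domain corpus path is a placeholder, not something from the thread):

```python
# Hypothetical sketch: fine-tune an OPUS-pretrained translation model
# on a small domain-specific parallel corpus.
from transformers import (MarianMTModel, MarianTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)
from datasets import load_dataset

model_name = "Helsinki-NLP/opus-mt-en-fr"  # base model trained on OPUS data
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# Domain corpus: one JSON object per line, e.g. {"en": "...", "fr": "..."}
# (placeholder file name).
dataset = load_dataset("json", data_files="domain_corpus.jsonl")["train"]

def preprocess(batch):
    # Tokenize source sentences and target translations together.
    return tokenizer(batch["en"], text_target=batch["fr"],
                     truncation=True, max_length=128)

tokenized = dataset.map(preprocess, batched=True,
                        remove_columns=dataset.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="opus-mt-en-fr-domain",
                                  num_train_epochs=3,
                                  per_device_train_batch_size=16),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```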
In this case, the author claims pretty good accuracy, almost on par with Google Brain's!
On my test set of 3,000 sentences, the translator obtained a BLEU score of 0.39. BLEU is the benchmark scoring system used in machine translation, and the current best I could find in English to French is around 0.42 (set by some smart folks at Google Brain). So, not bad.
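For context, BLEU measures n-gram overlap between candidate translations and references, here on a 0-1 scale (hence 0.39 vs. 0.42 above). A minimal sketch with NLTK, using toy placeholder sentences (not necessarily what the author used):

```python
# Hypothetical sketch: corpus-level BLEU with NLTK's implementation.
from nltk.translate.bleu_score import corpus_bleu

# Each hypothesis is a token list; each entry in `references` is a list
# of acceptable reference translations (here just one per sentence).
hypotheses = [["le", "chat", "est", "sur", "le", "tapis"]]
references = [[["le", "chat", "est", "sur", "le", "tapis"]]]

score = corpus_bleu(references, hypotheses)  # 0.0 .. 1.0
print(f"BLEU: {score:.2f}")
```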
Wow, missed that part when I read it. Pretty incredible that using open source data you can outperform the state-of-the-art machine translators of a few years ago.
A GAN-style approach to learning and generating variants could be interesting as well. It could generate a couple of hundred plausible versions. Then you have another network, trained to differentiate between fake and real colored photos, which picks the best version.
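Roughly, the "pick the best version" step could look like this; a minimal PyTorch sketch where the discriminator architecture, image shapes, and candidate count are all illustrative assumptions, not anything from the comment:

```python
# Hypothetical sketch: a discriminator trained on real vs. generated color
# photos scores a batch of candidate colorizations; keep the highest-scoring.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, 1),  # single "realness" logit per image
        )

    def forward(self, x):
        return self.net(x)

disc = Discriminator()  # in practice, trained adversarially first
candidates = torch.rand(200, 3, 64, 64)  # a couple hundred plausible colorings
with torch.no_grad():
    scores = disc(candidates).squeeze(1)
best = candidates[scores.argmax()]  # keep the most "real-looking" version
```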
Agreed. I imagine this has applications in compression as well. You could stream a movie (or a football game) in black and white and enable each device to color it on the spot. A similar technique could also be done for HD/3D/VR.
Hey, thanks! I'm one of the co-founders at FloydHub.
Our focus has been on individual data scientists thus far. We'll be rolling out a teams plan soon since folks doing deep learning at work (with on-prem or cloud infra) tend to run into more or less the same difficulties as individuals. If you have any feedback, I'd love to hear :) sai (at) floydhub.com
FloydHub | Cloud Infrastructure Engineer | San Francisco, CA | Onsite | Full-time | Salary: 100k-125k (0.5%-1% equity) | https://floydhub.com
FloydHub (YC W17) is building a Heroku for deep learning. We enable data scientists to do deep learning in the cloud with a few simple commands and without any of the infrastructure or DevOps hassles.
Every day, we handle training, scaling, and serving of several thousand deep learning models on our GPU clusters and manage terabytes of data. As an infrastructure engineer, you will be responsible for building and scaling our GPU cloud infrastructure: not simply operating the system, but architecting and building scalable, secure cloud infra.
Requirements: 3+ years of cloud infrastructure experience
We're a small team (4 engineers), agile and very early stage (YC W17). We're Stanford/CMU grads, with experience leading deep learning research at Bing/Microsoft, large engineering orgs (Location Labs, Avast), and infrastructure at LinkedIn. We're backed by some of the best VCs and angels in town (https://floydhub.com/about). If joining an awesome 4-person team doing AI/infra excites you, come join us! As a founding team member, you will have the opportunity to considerably impact not only the technical direction of the product, but also the culture of the company!
If you are interested in deep learning and how the space is evolving, this is the place to be. FloydHub is working on tons of super interesting problems, the solutions to which will make DL algorithms and GPU cloud infrastructure accessible to everyone.
DevOps is indeed a huge bottleneck in deep learning. Provisioning machines, installing drivers and packages, and managing their dependency hell all distract from the core deep learning work. At FloydHub (I'm a co-founder), we're building a zero-setup deep learning platform.
Spinning up a Jupyter notebook with Pytorch 0.2 is as simple as `floyd run --env pytorch-0.2 --mode jupyter`. All the steps you mention in your comment are automated.
DevOps hassles are, of course, just the first of many hurdles to doing effective deep learning. Experiment management, version control, reproducibility, and sharing & collaboration are other important problems.
FloydHub (YC W17) is building a Heroku for deep learning. Data scientists can train and deploy deep learning models in the cloud with a few simple commands and without any of the DevOps hassles. Instead of worrying about provisioning GPUs, installing drivers, and managing software dependencies, focus on what matters - the science itself.
We're small, agile and very early stage (YC W17). We're backed by some of the best VCs and angels in town. If joining a 4-person deep learning/infra startup excites you, come join our core founding team!
- Help us grow 5x, 10x, and 50x our current scale. You will lay the foundations for our community-building initiatives. Expertise: a data-driven growth mentality
I know from personal experience that they are working on very cool and important problems at the forefront of the deep learning space. Seriously, talk to them.
Hey! Glad you brought this up. We've gotten quite a few requests from students enrolled in Udacity courses and have been helping get them set up & run their class projects on Floyd.
Here are our instructions for the Self Driving Car Engineer nanodegree program: https://github.com/floydhub/CarND-Term1-Starter-Kit. Happy to do the same for your class as well! How can I reach you? Feel free to mail us directly: founders@floydhub.com.
We've also reached out to folks at Udacity to see if we can offer any official support for the courses.