We talk with Oleg Tarasenko and Tze Yiing about crawling the web using Elixir. Oleg created the Crawly project to help solve this problem, and Tze Yiing joined him as a contributor and maintainer. We cover how Elixir is well suited to orchestrating crawling, how to deal with login pages, understanding the legal concerns, building a codeless scraper, and much more!
Show Notes online - http://podcast.thinkingelixir.com/31
Elixir Community News
- https://dashbit.co/blog/ten-years-ish-of-elixir – January 9th marked the 10th year since the first commit to the Elixir repository
- https://github.com/elixir-lang/elixir/commit/337c3f2d569a42ebd5fcab6fef18c5e012f9be5b – First commit on the repository
- https://twitter.com/josevalim/status/1349010127270129670 – Jose Valim reveals that his secret project is called 'Nx'
- https://remote.com/blog/welcoming-elixir-creator-jose-valim – Jose Valim joins Remote as a Technical Advisor
- https://twitter.com/josevalim/status/1347858475267854336 – ExUnit will catch the SIGQUIT signal from CTRL+\ and show the tests that were running
- https://github.com/elixir-lang/elixir/blob/master/lib/mix/lib/mix/tasks/test.ex#L34 – ExUnit will print how much time the test suite spent on async tests vs sync tests
- https://twitter.com/fhunleth/status/1348092050487570433 – Nerves support on the M1 is looking good
- https://www.youtube.com/playlist?list=PLqj39LCvnOWZl_Pb0Y7wGWijKbTvL4gJg – ElixirConf 2020 videos have all been publicly released!
- https://oltarasenko.medium.com/using-elixir-and-crawly-for-price-monitoring-7364d345fc64 – Using Elixir for price monitoring
- https://www.erlang-solutions.com/blog/web-scraping-with-elixir.html – Oleg's older web scraping with Elixir article
- https://www.erlang-solutions.com/blog/how-to-build-a-machine-learning-project-in-elixir.html – Building a machine learning project with Elixir, TensorFlow and Crawly
- https://oltarasenko.medium.com/what-is-web-scraping-and-why-you-might-want-to-use-it-a0e4b621f6d0 – What is web scraping, and why you might want to use it?
- https://www.pillowskin.com – Ziinc's project using scraping and aggregation
- https://www.eff.org/deeplinks/2019/09/victory-ruling-hiq-v-linkedin-protects-scraping-public-data – EFF legal interpretation of the hiQ v. LinkedIn scraping case
- https://hexdocs.pm/crawly/readme.html#quickstart – Crawly quickstart guide
- https://hexdocs.pm/crawly/tutorial.html – Crawly tutorial
- https://github.com/oltarasenko/crawly_ui – Crawly UI project
- http://crawlyui.com/ – Crawly UI project page
- Data is the new gold
- https://t.me/elixir_crawly – Crawly Telegram group
- https://github.com/oltarasenko – Oleg on GitHub
- https://oltarasenko.medium.com/ – Oleg's Blog
- https://twitter.com/tzeyiing – Lee TzeYiing on Twitter
- https://github.com/Ziinc – Lee TzeYiing on GitHub
- https://www.tzeyiing.com – Lee TzeYiing's Blog
Find us online