diff --git a/pydata-london-2022/category.json b/pydata-london-2022/category.json new file mode 100644 index 000000000..ae574bbb5 --- /dev/null +++ b/pydata-london-2022/category.json @@ -0,0 +1,3 @@ +{ + "title": "PyData London 2022" +} diff --git a/pydata-london-2022/videos/ade-idowu-document-sentence-similarity-solution.json b/pydata-london-2022/videos/ade-idowu-document-sentence-similarity-solution.json new file mode 100644 index 000000000..0c361c9da --- /dev/null +++ b/pydata-london-2022/videos/ade-idowu-document-sentence-similarity-solution.json @@ -0,0 +1,47 @@ +{ + "description": "Ade Idowu Presents:\n\nDocument/Sentence Similarity Solution Using Open Source NLP Libraries, Frameworks and Datasets\n\nThe need to develop robust document/text similarity measure solutions is an essential step for building applications such as Recommendation Systems, Search Engines, Information Retrieval Systems including other ML/AI applications such as News Aggregators or Automated Recruitment systems used to match CVs to job specification and so on. In general, text similarity is the measure of how words/tokens, tweets, phrases, sentences, paragraphs and entire documents are lexically and\u202fsemantically close to each other. Texts/words are lexically similar if\u202fthey\u202fhave similar character sequence or structure and, are semantically similar if they have the same meaning, describe similar concepts and they are used in the same context.\u202f\u202f\n\nThis tutorial will demonstrate a number of strategies for feature extraction i.e., transforming documents to numeric feature vectors. This transformation step is a prerequisite for computing the similarity between documents. 
Typically, each strategy will involve 4 steps, namely: 1) the use of standard natural language pre-processing techniques to prepare/clean the documents, 2) the transformation of the document text into numeric vectors/embeddings, 3) calculation of document similarity using metrics such as Cosine, Euclidean and Jaccard and, 4) validation of the findings\n\nGithub Repo: https://github.com/aidowu1/Ades-NLP-Recepies/tree/master/Exploration%20of%20Document%20Similarity%20Models\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...\"", + "duration": 5302, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/aidowu1/Ades-NLP-Recepies/tree/master/Exploration%20of%20Document%20Similarity%20Models", + "url": "https://github.com/aidowu1/Ades-NLP-Recepies/tree/master/Exploration%20of%20Document%20Similarity%20Models" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." 
+ } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/qXcRW5fIa1g/maxresdefault.jpg", + "title": "Ade Idowu - Document/Sentence Similarity Solution", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=qXcRW5fIa1g" + } + ] +} diff --git a/pydata-london-2022/videos/adrin-jalali-questions-and-practices-to-make-algorithmic-decision-making-more-fair.json b/pydata-london-2022/videos/adrin-jalali-questions-and-practices-to-make-algorithmic-decision-making-more-fair.json new file mode 100644 index 000000000..6a80596dd --- /dev/null +++ b/pydata-london-2022/videos/adrin-jalali-questions-and-practices-to-make-algorithmic-decision-making-more-fair.json @@ -0,0 +1,43 @@ +{ + "description": "Adrin Jalali Presents:\n\nMeasurement and Fairness: Questions and Practices to Make Algorithmic Decision Making More Fair\n\nMachine learning is almost always used in systems which automate or semi-automate decision making processes. These decisions are used in recommender systems, fraud detection, healthcare recommendation systems, etc. Many systems, if not most, can induce harm by giving a less desirable outcome for cases where they should in fact give a more desired outcome, e.g. reporting an insurance claim to be fraud when indeed it is not.\n\nIn this talk we first go through different sources of harm which can creep into a system based on machine learning, and the types of harm an ML based system can induce.\n\nTaking lessons from social sciences, one can see input and output values of automated systems as measurements of constructs or a proxy measurement of those constructs. In this talk we go through a set of questions one should ask before and while working on such systems. 
Some of these questions can be answered quantitatively, and others qualitatively.\n\n\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n00:10 Help us add time stamps or captions to this video! See the description for details.\n\nWant to help add timestamps to our YouTube videos to help with discoverability? 
Find out more here: https://github.com/numfocus/YouTubeVideoTimestamps", + "duration": 2418, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVideoTimestamps", + "url": "https://github.com/numfocus/YouTubeVideoTimestamps" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/9uLDyK8jKYc/maxresdefault.jpg", + "title": "Adrin Jalali - Questions and Practices to Make Algorithmic Decision Making More Fair", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=9uLDyK8jKYc" + } + ] +} diff --git a/pydata-london-2022/videos/ahmet-melek-what-is-x-up-to-ner-and-relationship-extraction-for-information-extraction.json b/pydata-london-2022/videos/ahmet-melek-what-is-x-up-to-ner-and-relationship-extraction-for-information-extraction.json new file mode 100644 index 000000000..defd74c44 --- /dev/null +++ b/pydata-london-2022/videos/ahmet-melek-what-is-x-up-to-ner-and-relationship-extraction-for-information-extraction.json @@ -0,0 +1,51 @@ +{ + "description": "Ahmet Melek Presents:\n\nWhat is X up to? - NER and Relationship Extraction for Information Extraction\n\nDealing with unstructured text to obtain information is one of the biggest aims in the field of natural language processing. In this talk, we will be demoing a solution where we have unstructured text on a particular topic, and we apply named entity recognition, together with relationship extraction, to extract structured data. 
We will be introducing our data source, the models that we use, and will be inspecting the end results, viewing particular statistics, and hovering over a graph, extracted from the raw text.\n\nGithub: https://github.com/ahmetmeleq/PyData2022_NER_RelEx\nSlides: https://pydata.org/london2022/wp-content/uploads/2022/07/What-is-X-up-to_.pdf\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 1912, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/ahmetmeleq/PyData2022_NER_RelEx", + "url": "https://github.com/ahmetmeleq/PyData2022_NER_RelEx" + }, + { + "label": "https://pydata.org/london2022/wp-content/uploads/2022/07/What-is-X-up-to_.pdf", + "url": "https://pydata.org/london2022/wp-content/uploads/2022/07/What-is-X-up-to_.pdf" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." 
+ } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/nO59pdwWELA/maxresdefault.jpg", + "title": "Ahmet Melek - What is X up to? - NER and Relationship Extraction for Information Extraction", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=nO59pdwWELA" + } + ] +} diff --git a/pydata-london-2022/videos/alejandro-saucedo-accelerating-machine-learning-at-scale-with-huggingface-optimum-and-seldon.json b/pydata-london-2022/videos/alejandro-saucedo-accelerating-machine-learning-at-scale-with-huggingface-optimum-and-seldon.json new file mode 100644 index 000000000..2aeb2d7bc --- /dev/null +++ b/pydata-london-2022/videos/alejandro-saucedo-accelerating-machine-learning-at-scale-with-huggingface-optimum-and-seldon.json @@ -0,0 +1,43 @@ +{ + "description": "Alejandro Saucedo Presents:\n\nAccelerating High-Performance Machine Learning with HuggingFace, Optimum & Seldon\n\nIdentifying the right tools for high performance production machine learning may be overwhelming as the ecosystem continues to grow at break-neck speed. In this session showcase how practitioners can productionise ML models in scalable ecosystems in an optimizable way without having to deal with the underlying infrastructure challenges. Saucedo takes a GPT-2 HuggingFace model, optimizing it with ONNX and deploying to MLServer at scale using Seldon.\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. 
The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2188, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/BQ8NrdkiE44/maxresdefault.jpg", + "title": "Alejandro Saucedo - Accelerating Machine Learning at Scale with HuggingFace, Optimum and Seldon", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=BQ8NrdkiE44" + } + ] +} diff --git a/pydata-london-2022/videos/alexander-hendorf-lessons-learned-about-data-ai-at-enterprises-and-smes-pydata-london-2022.json b/pydata-london-2022/videos/alexander-hendorf-lessons-learned-about-data-ai-at-enterprises-and-smes-pydata-london-2022.json new file mode 100644 index 000000000..a309e64ca --- /dev/null +++ b/pydata-london-2022/videos/alexander-hendorf-lessons-learned-about-data-ai-at-enterprises-and-smes-pydata-london-2022.json @@ -0,0 +1,43 @@ +{ + "description": 
"Alexander Hendorf presents:\n\nLessons Learned About Data & AI at Enterprises and SMEs\n\nAll one needs is strategy, skill and resources to make digitalization and AI happen. So why is everything taking so long? Shouldn\u2019t you all be finished yesterday already? An honest talk about how to address the complexity of making data and AI happen in enterprises.\n\nMany incumbents are transitioning to new technologies while their businesses operate on systems that are years or decades old. Introducing new technologies is not just about introducing Open Source or introducing community culture or working agile or SCRUM or explaining complicated technology stuff to executives. The truth is: it requires all of it and likely even more. Mastering innovation requires having many balls in the air at once.\n\nIn this talk I'll present a transformation use case of an established player including our best practices and anti-patterns.\n\nWe will discuss the following aspects:\n- From idea to strategy\n- Assessing the status quo\n- Introducing Python and Open Source and what to use (or not)\n- Legacy is in the house, still\n- Getting all departments on the same page\n- Introducing a community-driven collaborative culture\n\nSlides: https://pydata.org/london2022/wp-content/uploads/2022/07/LESSONS-LEARNED-ABOUT-DATA-AI-AT-ENTERPRISES-AND-SMES-PyData-London-22.pdf\n\nwww.pydata.org \n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. 
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.", + "duration": 2321, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://pydata.org/london2022/wp-content/uploads/2022/07/LESSONS-LEARNED-ABOUT-DATA-AI-AT-ENTERPRISES-AND-SMES-PyData-London-22.pdf", + "url": "https://pydata.org/london2022/wp-content/uploads/2022/07/LESSONS-LEARNED-ABOUT-DATA-AI-AT-ENTERPRISES-AND-SMES-PyData-London-22.pdf" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/Bp3pUSZ6DpU/maxresdefault.jpg", + "title": "Alexander Hendorf - Lessons Learned About Data & AI at Enterprises and SMEs | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=Bp3pUSZ6DpU" + } + ] +} diff --git a/pydata-london-2022/videos/anders-bogsnes-sqlalchemy-and-you-making-sql-the-best-thing-since-sliced-bread.json b/pydata-london-2022/videos/anders-bogsnes-sqlalchemy-and-you-making-sql-the-best-thing-since-sliced-bread.json new file mode 100644 index 000000000..1373609ff --- /dev/null +++ b/pydata-london-2022/videos/anders-bogsnes-sqlalchemy-and-you-making-sql-the-best-thing-since-sliced-bread.json @@ -0,0 +1,47 @@ +{ + "description": "Anders Bogsnes presents: \n\nSQLAlchemy and you - making SQL the best thing since sliced bread\n\nAre you writing SQL strings in your code? 
Have you only used ORMs and want to start getting more control over your SQL?\n\nSQLAlchemy is the gold-standard for working with SQL in Python and this tutorial will get you comfortable with working in it so you can take advantage of its power. We will go through Core and ORM abstractions so you'll be comfortable navigating through the different layers and be able to fully use the power of Python when writing your SQL \n\nGithub Repo: https://github.com/andersbogsnes/pydata-london-2022-sqlalchemy-tutorial\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.\n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 5302, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/andersbogsnes/pydata-london-2022-sqlalchemy-tutorial", + "url": "https://github.com/andersbogsnes/pydata-london-2022-sqlalchemy-tutorial" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." 
+ } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/X4-hu3vZAOg/maxresdefault.jpg", + "title": "Anders Bogsnes - SQLAlchemy and You - Making SQL the Best Thing Since Sliced Bread", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=X4-hu3vZAOg" + } + ] +} diff --git a/pydata-london-2022/videos/asya-frumkin-can-you-read-this-how-i-improved-text-readability-online-for-visually-impaired.json b/pydata-london-2022/videos/asya-frumkin-can-you-read-this-how-i-improved-text-readability-online-for-visually-impaired.json new file mode 100644 index 000000000..d5e642b6f --- /dev/null +++ b/pydata-london-2022/videos/asya-frumkin-can-you-read-this-how-i-improved-text-readability-online-for-visually-impaired.json @@ -0,0 +1,43 @@ +{ + "description": "Asya Frumkin Presents:\n\nCan You Read This? (Or: How I Improved Text Readability on the Web for the Visually Impaired)\n\nThis talk will describe how Asya Frumkin used deep learning to identify texts on background images that are illegible for people with vision impairments. Frumkin explains the challenges encountered when using different OCR architectures for this task, and talks about the original solution she came up with. \n\nThe web is a visual amusement park, full of colorful images, playful gifs, and funny videos. But when that visual richness is combined with text, people with low vision or color blindness increasingly struggle with the basic function of reading and thus can\u2019t fully enjoy what the web has to offer. In this talk, Frumkin dives into the science of colors and how different people perceive them. 
She discusses how color contrast affects text readability, and explains the challenge of estimating color contrast for text on image. Frumkin also explores how modern deep learning tools such as OCR and text detection can be used for text classification based on readability. In addition, Frumkin shares her personal experience of working with visual data whilst having a visual impairment. \n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2211, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/HYDm9O-KTuU/maxresdefault.jpg", + "title": "Asya Frumkin - Can you Read This? 
How I Improved Text Readability Online for Visually Impaired", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=HYDm9O-KTuU" + } + ] +} diff --git a/pydata-london-2022/videos/bartczak-klimont-beyond-medical-image-segmentation-the-road-towards-clinical-insights.json b/pydata-london-2022/videos/bartczak-klimont-beyond-medical-image-segmentation-the-road-towards-clinical-insights.json new file mode 100644 index 000000000..765612d1f --- /dev/null +++ b/pydata-london-2022/videos/bartczak-klimont-beyond-medical-image-segmentation-the-road-towards-clinical-insights.json @@ -0,0 +1,43 @@ +{ + "description": "Adam Klimont and Tomasz Bartczak Present:\n\nBeyond Medical Image Segmentation. The Road Towards Clinical Insights\n\nRecent progress in deep learning for medical imaging has led to impressive results. Among them is a fully automatic human organ segmentation from Computed Tomography (CT) scans. Organ segmentation can be the end goal in itself, e.g. when it is directly viewed by clinical teams. It can also serve as an input to diagnostic aid tools. Moreover, specific knowledge can be extracted out of segmentations to build databases. These databases can then be used for reasoning about the anatomy or planning treatment.\n\nIn this talk, Klimont and Bartczak describe a multi-stage pipeline for processing CT scans for abdominal aortic aneurysm (AAA) treatment planning. They share their experience in sub-organ multilabel segmentation. Klimont and Bartczak discuss the challenges with common loss functions, and with metrics not being well aligned with clinical significance. The pair will show how enhanced segmentation can be used to represent patient anatomy in an accessible way for end-users who plan treatment for new patients.\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. 
PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2152, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/HoARFQQA83I/maxresdefault.jpg", + "title": "Bartczak, Klimont - Beyond Medical Image Segmentation. 
The Road Towards Clinical Insights", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=HoARFQQA83I" + } + ] +} diff --git a/pydata-london-2022/videos/cheuk-ting-ho-picking-what-to-watch-next-build-a-recommendation-system-pydata-london-2022.json b/pydata-london-2022/videos/cheuk-ting-ho-picking-what-to-watch-next-build-a-recommendation-system-pydata-london-2022.json new file mode 100644 index 000000000..5769c8e6c --- /dev/null +++ b/pydata-london-2022/videos/cheuk-ting-ho-picking-what-to-watch-next-build-a-recommendation-system-pydata-london-2022.json @@ -0,0 +1,43 @@ +{ + "description": "Cheuk Ting Ho presents: \n\nPicking What to Watch Next: Build a Recommendation System\n\nRecommendation algorithms are the driving force of many businesses: e-commerce, personalized advertisement, on-demand entertainment. Computer algorithms know what you like and present you with things that are customized for you. Here we will explore how to do that by building a system ourselves.\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? 
Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 1957, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/JYZxiBcmL-s/maxresdefault.jpg", + "title": "Cheuk Ting Ho - Picking What to Watch Next: Build a Recommendation System | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=JYZxiBcmL-s" + } + ] +} diff --git a/pydata-london-2022/videos/chris-fonnesbeck-probabilistic-python-an-introduction-to-bayesian-modeling-with-pymc.json b/pydata-london-2022/videos/chris-fonnesbeck-probabilistic-python-an-introduction-to-bayesian-modeling-with-pymc.json new file mode 100644 index 000000000..f847ee64b --- /dev/null +++ b/pydata-london-2022/videos/chris-fonnesbeck-probabilistic-python-an-introduction-to-bayesian-modeling-with-pymc.json @@ -0,0 +1,39 @@ +{ + "description": "Chris Fonnesbeck presents:\n\nProbabilistic Python: An Introduction to Bayesian Modeling with PyMC\n\nBayesian statistical methods offer a powerful set of tools to tackle a wide variety of data science problems. In addition, the Bayesian approach generates results that are easy to interpret and automatically account for uncertainty in quantities that we wish to estimate and predict. Historically, computational challenges have been a barrier, particularly to new users, but there now exists a mature set of probabilistic programming tools that are both capable and easy to learn. 
We will use the newest release of PyMC (version 4) in this tutorial, but the concepts and approaches that will be taught are portable to any probabilistic programming framework.\n\nThis tutorial is intended for practicing and aspiring data scientists and analysts looking to learn how to apply Bayesian statistics and probabilistic programming to their work. It will provide learners with a high-level understanding of Bayesian statistical methods and their potential for use in a variety of applications. They will also gain hands-on experience with applying these methods using PyMC, specifically including the specification, fitting and checking of models applied to a couple of real-world datasets.\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. 
PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n0:08 Introduction\n1:19 Probabilistic programming\n1:53 Stochastic language \u201cprimitives\u201d\n3:06 Bayesian inference\n3:27 What is Bayes?\n3:57 Inverse probability\n4:39 Why Bayes\n5:13 The Bayes formula\n4:21 Stochastic programs\n6:51 Prior distribution\n8:12 Likelihood function\n8:29 Normal distribution\n8:53 Binomial distribution\n9:14 Poisson distribution\n9:32 Infer values for latent variables\n9:54 Posterior distribution\n9:47 Probabilistic programming abstracts the inference procedure\n10:56 Bayes by hand\n12:18 Conjugacy\n16:43 Probabilistic programming in Python\n17:24 PyMC and its features\n19:15 Question: Among the different probabilistic programming libraries, is there a difference in what they have to offer?\n20:39 Question: How can one know which likelihood distribution to choose?\n21:35 Question: Is there a methodology used to specify the likelihood distribution?\n22:30 Example: Building models in PyMC\n27:31 Stochastic and deterministic variables\n37:11 Observed Random Variables\n41:00 Question: To what extent are the features of PyMC supported if compiled in different backends?\n41:47 Markov Chain Monte Carlo and Bayesian approximation\n43:04 Markov chains\n44:19 Reversible Markov chains\n45:06 Metropolis sampling\n48:00 Hamiltonian Monte Carlo\n49:10 Hamiltonian dynamics\n50:49 No U-turn Sampler (NUTS)\n52:11 Question: How do you know the number of leap frog steps to take?\n52:55 Example: Markov Chain Monte Carlo in PyMC\n1:13:30 Divergences and how to deal with them\n1:15:08 Bayesian Fraction of Missing Information\n1:16:25 Potential Scale Reduction\n1:17:57 Goodness of fit\n1:22:40 Intuitive Bayes course\n1:23:09 Question: Do bookmakers use PyMC or Bayesian methods?\n1:23:53 Question: How does it work if you have different samplers for different variables?\n1:25:09 Question: What route should one take in case of
data with many discrete variables and many possible values?\n1:25:39 Question: Is there a natural way to use PyMC over a cluster of CPUs?", + "duration": 5194, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/911d4A1U0BE/maxresdefault.jpg", + "title": "Chris Fonnesbeck - Probabilistic Python: An Introduction to Bayesian Modeling with PyMC", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=911d4A1U0BE" + } + ] +} diff --git a/pydata-london-2022/videos/cooper-and-hall-making-fake-data-generators-for-open-source-healthcare-data-science-projects.json b/pydata-london-2022/videos/cooper-and-hall-making-fake-data-generators-for-open-source-healthcare-data-science-projects.json new file mode 100644 index 000000000..74f6b3d52 --- /dev/null +++ b/pydata-london-2022/videos/cooper-and-hall-making-fake-data-generators-for-open-source-healthcare-data-science-projects.json @@ -0,0 +1,47 @@ +{ + "description": "Matthew Cooper and Jennifer Hall Present:\n\nMaking Fake Data Generators for Open Source Healthcare Data Science Projects\n\nGithub: https://github.com/nhsx/\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. 
PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2342, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/nhsx/", + "url": "https://github.com/nhsx/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/vDWx2_e6nRU/maxresdefault.jpg", + "title": "Cooper and Hall - Making Fake Data Generators for Open Source Healthcare Data Science Projects", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=vDWx2_e6nRU" + } + ] +} diff --git a/pydata-london-2022/videos/data-pipelining-for-real-time-ml-models-pydata-london-2022.json b/pydata-london-2022/videos/data-pipelining-for-real-time-ml-models-pydata-london-2022.json new file mode 100644 index 000000000..db2e7dde3 --- /dev/null +++ b/pydata-london-2022/videos/data-pipelining-for-real-time-ml-models-pydata-london-2022.json @@ -0,0 +1,39 @@ +{ + "description": "Gabor Bakos Presents: \n\nData Pipelining for Real-time ML Models\n\nReinventing the wheel is usually not something we should be striving for, so why did we build our data pipeline from 
scratch? There are numerous design choices people make and they can highly affect the potential use cases. When making a custom pipeline you can make your own trade-offs between speed, throughput, simplicity and consistency of code/logic/data.\n\nMarket makers like Optiver are usually associated with ultra-low latency infrastructure, however there are plenty of use cases where human latency (seconds) is acceptable. Computing derived metrics, training models and making predictions as new data arrives are just a few such applications and what we will focus on in this presentation.\n\nwww.pydata.org", + "duration": 2316, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/8cN9gM9aBjA/maxresdefault.jpg", + "title": "Data Pipelining for Real-time ML Models | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=8cN9gM9aBjA" + } + ] +} diff --git a/pydata-london-2022/videos/davide-frazzetto-a-hitchhiker-guide-to-mlops-pydata-london-2022.json b/pydata-london-2022/videos/davide-frazzetto-a-hitchhiker-guide-to-mlops-pydata-london-2022.json new file mode 100644 index 000000000..57f844077 --- /dev/null +++ b/pydata-london-2022/videos/davide-frazzetto-a-hitchhiker-guide-to-mlops-pydata-london-2022.json @@ -0,0 +1,47 @@ +{ + "description": "Davide Frazzetto Presents:\n\nA Hitchhiker Guide to MLOps\n\nBringing Machine Learning (ML) applications to a live production phase comes with all the same challenges of traditional software development, and more. 
Examples are: large datasets, tracking data quality and models quality, experiments reproducibility, and monitoring a live application. This talk is a grounded introduction to monitoring the ML lifecycle with only open source software.\n\nSlides: https://github.com/dave-frazzetto/hitchhiker-guide-mlops\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2417, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/dave-frazzetto/hitchhiker-guide-mlops", + "url": "https://github.com/dave-frazzetto/hitchhiker-guide-mlops" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." 
+ } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/_M0sLnZTLog/maxresdefault.jpg", + "title": "Davide Frazzetto - A Hitchhiker Guide to MLOps | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=_M0sLnZTLog" + } + ] +} diff --git a/pydata-london-2022/videos/dillon-gardner-auc-is-worthless-lessons-to-transition-from-academic-to-business-data-science.json b/pydata-london-2022/videos/dillon-gardner-auc-is-worthless-lessons-to-transition-from-academic-to-business-data-science.json new file mode 100644 index 000000000..042e97261 --- /dev/null +++ b/pydata-london-2022/videos/dillon-gardner-auc-is-worthless-lessons-to-transition-from-academic-to-business-data-science.json @@ -0,0 +1,47 @@ +{ + "description": "Dillon Gardner Presents:\n\nAUC is Worthless: Lessons in Transitioning From Academic to Business Data Science\n\nNew data scientists often struggle to make major impacts on solving business problems despite impressive technical skills. A core challenge is the gap between how academics think about performance of models and what matters for a company. As an example, academic work summarizes a model\u2019s receiver operating characteristic (ROC) curve with the area under the curve (AUC). This summary statistic is useless for business applications, which will always have unique trade-offs and constraints. Effective approaches to optimizing model performance require understanding the specific business requirements and how to map them to a well-framed data science problem.\n\nIn this talk, I will go through a framework of how to think effectively about model trade-offs in terms of maximizing business utility.
Through this exercise, we will build intuition for what is required for a model in production to be a success and how to collaborate more effectively with non-technical co-workers.\n\nSlides: https://pydata.org/london2022/wp-content/uploads/2022/07/AUC_is_Worthless_DR_Gardner.pdf\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2420, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://pydata.org/london2022/wp-content/uploads/2022/07/AUC_is_Worthless_DR_Gardner.pdf", + "url": "https://pydata.org/london2022/wp-content/uploads/2022/07/AUC_is_Worthless_DR_Gardner.pdf" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." 
+ } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/aTJPvfQ_fLY/maxresdefault.jpg", + "title": "Dillon Gardner - AUC is Worthless: Lessons to Transition From Academic to Business Data Science", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=aTJPvfQ_fLY" + } + ] +} diff --git a/pydata-london-2022/videos/dimachkie-kahlow-kernes-zintchenko-understanding-your-bank-statement-in-100ms.json b/pydata-london-2022/videos/dimachkie-kahlow-kernes-zintchenko-understanding-your-bank-statement-in-100ms.json new file mode 100644 index 000000000..e957bca2f --- /dev/null +++ b/pydata-london-2022/videos/dimachkie-kahlow-kernes-zintchenko-understanding-your-bank-statement-in-100ms.json @@ -0,0 +1,43 @@ +{ + "description": "Chady Dimachkie, Robin Kahlow, Dr. Jonathan Kernes and Dr. Ilia Zintchenko Present:\n\nUnderstanding Your Bank Statement in 100ms\n\nIn the last year, the global number of fintech companies has nearly doubled. Yet, despite the rapid growth, there is one area of banking that has been notoriously difficult to modernize: financial transactions. More than 1 billion transactions occur every day around the world. Transactions are different in every country and language, require knowledge of every merchant and location, depend on the context of the surrounding parties involved and are specific for each use case. At Ntropy, we enable developers to parse financial transactions in under 100ms with super-human accuracy, unlocking the path to a new generation of autonomous finance, powering products and services that have never before been possible. 
We will for the first time discuss the key parts of our pipeline, made possible by the latest advancements in natural language understanding and unsupervised learning.\n\nThis talk is geared at practitioners interested in knowing how bank transactions can be understood by a machine. Expert knowledge of machine learning is not required.\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n00:10 Help us add time stamps or captions to this video! See the description for details.\n\nWant to help add timestamps to our YouTube videos to help with discoverability? 
Find out more here: https://github.com/numfocus/YouTubeVideoTimestamps", + "duration": 2164, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVideoTimestamps", + "url": "https://github.com/numfocus/YouTubeVideoTimestamps" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/r-O1jdG1kuA/maxresdefault.jpg", + "title": "Dimachkie, Kahlow, Kernes, & Zintchenko - Understanding your Bank Statement in 100ms", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=r-O1jdG1kuA" + } + ] +} diff --git a/pydata-london-2022/videos/dr-susan-mulcahy-keynote-lighthearted-thoughts-on-conversing-about-data-pydata-london-2022.json b/pydata-london-2022/videos/dr-susan-mulcahy-keynote-lighthearted-thoughts-on-conversing-about-data-pydata-london-2022.json new file mode 100644 index 000000000..cf7deb279 --- /dev/null +++ b/pydata-london-2022/videos/dr-susan-mulcahy-keynote-lighthearted-thoughts-on-conversing-about-data-pydata-london-2022.json @@ -0,0 +1,39 @@ +{ + "description": "Dr. Susan Mulcahy Presents:\n\nKeynote: Lighthearted Thoughts on Conversing About Data\n\nBy bringing more of us into the conversation on data, we can represent an array of colleagues and specialisms, not all of which are technical. We\u2019ll look at some examples of breaking down your message to its essential components in order to bridge the gap of differing specialisms. And of course, with some lighthearted points woven into the talk for good measure.\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. 
PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.", + "duration": 2029, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/62ylM6-9lh0/maxresdefault.jpg", + "title": "Dr. 
Susan Mulcahy - Keynote: Lighthearted Thoughts on Conversing About Data | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=62ylM6-9lh0" + } + ] +} diff --git a/pydata-london-2022/videos/eyal-kazin-hypothesis-testing-stop-criterion-with-precision-is-the-goal-pydata-london-2022.json b/pydata-london-2022/videos/eyal-kazin-hypothesis-testing-stop-criterion-with-precision-is-the-goal-pydata-london-2022.json new file mode 100644 index 000000000..ae668a01f --- /dev/null +++ b/pydata-london-2022/videos/eyal-kazin-hypothesis-testing-stop-criterion-with-precision-is-the-goal-pydata-london-2022.json @@ -0,0 +1,43 @@ +{ + "description": "Eyal Kazin Presents:\n\nDon't Stop 'til You Get Enough - Hypothesis Testing Stop Criterion with \u201cPrecision Is The Goal\u201d\n\nIn hypothesis testing the stopping criterion for data collection is a non-trivial question that puzzles many analysts. This is especially true with sequential testing where demands for quick results may lead to biassed ones. Kazin shows how the belief that Bayesian approaches magically resolve this issue is misleading and how to obtain reliable outcomes by focusing on sample precision as a goal.\n\nSlides: https://bit.ly/precision-goal\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. 
PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.", + "duration": 2603, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://bit.ly/precision-goal", + "url": "https://bit.ly/precision-goal" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/_j3Q1AblY44/maxresdefault.jpg", + "title": "Eyal Kazin - Hypothesis Testing Stop Criterion with \"Precision Is The Goal\" | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=_j3Q1AblY44" + } + ] +} diff --git a/pydata-london-2022/videos/franz-kiraly-sktime-python-toolbox-for-time-series-how-to-implement-your-own-estimator.json b/pydata-london-2022/videos/franz-kiraly-sktime-python-toolbox-for-time-series-how-to-implement-your-own-estimator.json new file mode 100644 index 000000000..4845f33af --- /dev/null +++ b/pydata-london-2022/videos/franz-kiraly-sktime-python-toolbox-for-time-series-how-to-implement-your-own-estimator.json @@ -0,0 +1,51 @@ +{ + "description": "Franz Kiraly presents: \n\nSktime - Python Toolbox for Time Series: How to Implement Your Own Estimator\n\nSktime is a widely used scikit-learn compatible library for learning with time series. Sktime is easily extensible by anyone, and interoperable with the PyData/NumFOCUS stack. This tutorial explains how to write your own Sktime estimator (e.g., forecaster, classifier, transformer) by using Sktime\u2019s extension templates and testing framework. A custom estimator can live in any local code base, and will be compatible with Sktime pipelines, or scikit-learn.
A continuation of the Sktime introductory tutorial at PyData. https://www.youtube.com/watch?v=ODspi8-uWgo \n\nGithub Repo: https://github.com/sktime/sktime-tutorial-pydata-global-2021\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.\n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 5183, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://www.youtube.com/watch?v=ODspi8-uWgo", + "url": "https://www.youtube.com/watch?v=ODspi8-uWgo" + }, + { + "label": "https://github.com/sktime/sktime-tutorial-pydata-global-2021", + "url": "https://github.com/sktime/sktime-tutorial-pydata-global-2021" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." 
+ } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/S_3ewcvs_pg/maxresdefault.jpg", + "title": "Franz Kiraly - Sktime\u2014Python Toolbox for Time Series: How to Implement Your Own Estimator", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=S_3ewcvs_pg" + } + ] +} diff --git a/pydata-london-2022/videos/hanna-van-der-vlis-clusterf-ck-a-practical-guide-to-bayesian-hierarchical-modeling-in-pymc3.json b/pydata-london-2022/videos/hanna-van-der-vlis-clusterf-ck-a-practical-guide-to-bayesian-hierarchical-modeling-in-pymc3.json new file mode 100644 index 000000000..26d8a19fa --- /dev/null +++ b/pydata-london-2022/videos/hanna-van-der-vlis-clusterf-ck-a-practical-guide-to-bayesian-hierarchical-modeling-in-pymc3.json @@ -0,0 +1,47 @@ +{ + "description": "Hanna van der Vlis Presents:\n\nClusterf*ck: A Practical Guide to Bayesian Hierarchical Modeling in PyMC3\n\nAt Apollo Agriculture, a Kenya based agro-tech startup, one of the challenging problems we face is to predict yields of Kenyan maize farmers. Like almost all data-sets, this data-set has a hierarchical structure: farmers within the same region aren\u2019t independent. By ignoring this fact, a model could predict yields entirely from the region of the farmer, but fails to find any other meaningful insights, and we may not even realize. However, if we \u201covercorrected,\u201d treating each region as completely separate, each individual analysis could be underpowered. Enter the hero of our story: Bayesian hierarchical modeling. 
Using a practical example in Pymc3, we\u2019ll follow this hero as they identify and overcome clustered data-sets.\n\nSlides: https://pydata.org/london2022/wp-content/uploads/2022/07/Clusterf_ck_-A-practical-guide-to-Bayesian-hierarchical-modeling-in-Pymc3.pdf\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2149, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://pydata.org/london2022/wp-content/uploads/2022/07/Clusterf_ck_-A-practical-guide-to-Bayesian-hierarchical-modeling-in-Pymc3.pdf", + "url": "https://pydata.org/london2022/wp-content/uploads/2022/07/Clusterf_ck_-A-practical-guide-to-Bayesian-hierarchical-modeling-in-Pymc3.pdf" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." 
+ } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/pbcxb9xpTBI/maxresdefault.jpg", + "title": "Hanna van der Vlis - Clusterf*ck: A Practical Guide to Bayesian Hierarchical Modeling in PyMC3", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=pbcxb9xpTBI" + } + ] +} diff --git a/pydata-london-2022/videos/ian-ozsvald-building-successful-data-science-projects-pydata-london-2022.json b/pydata-london-2022/videos/ian-ozsvald-building-successful-data-science-projects-pydata-london-2022.json new file mode 100644 index 000000000..74b972e77 --- /dev/null +++ b/pydata-london-2022/videos/ian-ozsvald-building-successful-data-science-projects-pydata-london-2022.json @@ -0,0 +1,43 @@ +{ + "description": "Ian Ozsvald Presents:\n\nBuilding Successful Data Science Projects\n\nYour data science projects haven't worked out so well - maybe you didn't have a plan, you suffered from surprising unknowns or you couldn't deliver what someone else promised. Ozsvald shares some painful past experiences and explains choices that will increase your success. Ozsvald roots this in a recently shipped solution worth $1 million for a client.\n\nSlides: https://speakerdeck.com/ianozsvald/building-successful-data-science-projects\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization.
PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.", + "duration": 2425, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://speakerdeck.com/ianozsvald/building-successful-data-science-projects", + "url": "https://speakerdeck.com/ianozsvald/building-successful-data-science-projects" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/X7E916dfHuE/maxresdefault.jpg", + "title": "Ian Ozsvald - Building Successful Data Science Projects | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=X7E916dfHuE" + } + ] +} diff --git a/pydata-london-2022/videos/james-powell-whatever-i-can-do-to-entertain-you-in-30-minutes-pydata-london-2022.json b/pydata-london-2022/videos/james-powell-whatever-i-can-do-to-entertain-you-in-30-minutes-pydata-london-2022.json new file mode 100644 index 000000000..028fdfbaf --- /dev/null +++ b/pydata-london-2022/videos/james-powell-whatever-i-can-do-to-entertain-you-in-30-minutes-pydata-london-2022.json @@ -0,0 +1,39 @@ +{ + "description": "James Powell speaking at PyData 2022\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. 
The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.", + "duration": 1577, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/-z2eqLwVmzw/maxresdefault.jpg", + "title": "James Powell - Whatever I Can do to Entertain You in 30 Minutes | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=-z2eqLwVmzw" + } + ] +} diff --git a/pydata-london-2022/videos/jay-alammar-large-language-models-for-real-world-applications-a-gentle-intro.json b/pydata-london-2022/videos/jay-alammar-large-language-models-for-real-world-applications-a-gentle-intro.json new file mode 100644 index 000000000..03ec627cb --- /dev/null +++ b/pydata-london-2022/videos/jay-alammar-large-language-models-for-real-world-applications-a-gentle-intro.json @@ -0,0 +1,43 @@ +{ + "description": "Jay Alammar Presents:\n\nLarge Language Models for Real-World Applications - A Gentle Intro\n\nMachine language understanding and generation has been undergoing rapid improvements due to recent breakthroughs in machine learning (e.g. large language models like GPT and BERT). 
And while big tech and NLP engineers were quick to capitalize on these models, the broader developer community lags in adopting these models and realizing their potential in their business domains.\n\nThis talk provides a gentle and highly visual overview of some of the main intuitions and real-world applications of large language models. It assumes no prior knowledge of language processing and aims to bring attendees up to date with the fundamental intuitions and applications of large language models.\n\nwww.pydata.org\n\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2177, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." 
+ } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/xSGX8gBQDO8/maxresdefault.jpg", + "title": "Jay Alammar - Large Language Models for Real-World Applications - A Gentle Intro", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=xSGX8gBQDO8" + } + ] +} diff --git a/pydata-london-2022/videos/jim-dowling-python-centric-feature-stores-pydata-london-2022.json b/pydata-london-2022/videos/jim-dowling-python-centric-feature-stores-pydata-london-2022.json new file mode 100644 index 000000000..0dc1b416e --- /dev/null +++ b/pydata-london-2022/videos/jim-dowling-python-centric-feature-stores-pydata-london-2022.json @@ -0,0 +1,43 @@ +{ + "description": "Jim Dowling Presents:\n\nPython-centric Feature Stores\n\nMost enterprise data used by Data Scientists to train machine learning models is tabular data that comes from data warehouses and data lakes. Recent growth in the popularity of the modern data stack, based on lakehouses like Snowflake, Delta Lake, Big Query, and Redshift, have led to growth in the use of SQL-centric tools for data engineers, such as DBT. However, Data Scientists' language of choice is Python. How do we square this circle?\n\nIn this talk, Jim Dowling investigates the role of the Feature Store for machine learning in enabling Python native access to enterprise data for both training and serving features to models. In particular, Dowling describes the problem of how to create point-in-time consistent training data from features spread over many tables using a SQL backend from Python. \n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. 
PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2549, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." 
+ } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/AIof4woJSkY/maxresdefault.jpg", + "title": "Jim Dowling - Python-centric Feature Stores | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=AIof4woJSkY" + } + ] +} diff --git a/pydata-london-2022/videos/jon-bannister-notebooker-production-and-scheduling-for-your-jupyter-notebooks.json b/pydata-london-2022/videos/jon-bannister-notebooker-production-and-scheduling-for-your-jupyter-notebooks.json new file mode 100644 index 000000000..4b678a54f --- /dev/null +++ b/pydata-london-2022/videos/jon-bannister-notebooker-production-and-scheduling-for-your-jupyter-notebooks.json @@ -0,0 +1,51 @@ +{ + "description": "Jon Bannister Presents:\n\nNotebooker: Production and Scheduling for your Jupyter Notebooks\n\nNotebooker is an open-source web-based mongo-backed application which can help you turn your Jupyter Notebooks into reports which can be parametrised, scheduled, and shared in a few clicks. In this talk, I introduce Notebooker, how it works, and how it can help you.\n\nGithub: https://github.com/man-group/notebooker\nSlides: https://pydata.org/london2022/wp-content/uploads/2022/07/notebooker-pydata-2022.pptx\n\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. 
PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2061, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/man-group/notebooker", + "url": "https://github.com/man-group/notebooker" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + }, + { + "label": "https://pydata.org/london2022/wp-content/uploads/2022/07/notebooker-pydata-2022.pptx", + "url": "https://pydata.org/london2022/wp-content/uploads/2022/07/notebooker-pydata-2022.pptx" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/3ZNlSkytueA/maxresdefault.jpg", + "title": "Jon Bannister - Notebooker: Production and Scheduling for your Jupyter Notebooks", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=3ZNlSkytueA" + } + ] +} diff --git a/pydata-london-2022/videos/juan-luis-cano-rodriguez-beyond-pandas-the-great-python-dataframe-showdown-pydata-london-2022.json b/pydata-london-2022/videos/juan-luis-cano-rodriguez-beyond-pandas-the-great-python-dataframe-showdown-pydata-london-2022.json new file mode 100644 index 000000000..c279914f8 --- /dev/null +++ 
b/pydata-london-2022/videos/juan-luis-cano-rodriguez-beyond-pandas-the-great-python-dataframe-showdown-pydata-london-2022.json @@ -0,0 +1,51 @@ +{ + "description": "Juan Luis Cano Rodr\u00edguez Presents:\n\nBeyond Pandas: The Great Python Dataframe Showdown\n\nThe pandas library is one of the key factors that enabled the growth of Python in the Data Science industry and continues to help data scientists thrive almost 15 years after its creation. Because of this success, nowadays there are several open-source projects that claim to improve pandas in various ways, either by bringing it to a distributed computing setting (Dask), accelerating its performance with minimal changes (Modin), or offering a slightly different API that solves some of its shortcomings (Polars). \n\nIn this talk we will go over some of the most widely used dataframe Python libraries beyond pandas, clarify the relationship between them, compare them in terms of project scope and proximity to the original pandas API, and offer advice on when to use each of them.\n\nIf you are a seasoned pandas user willing to explore alternatives, or a beginner user wondering what all the fuss is about these new dataframe libraries, this talk is for you!\n\nGithub Repo: https://github.com/astrojuanlu/talk-dataframes\nSlide Deck: https://nbviewer.org/format/slides/github/astrojuanlu/talk-dataframes/blob/main/slides.ipynb\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. 
\n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2405, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/astrojuanlu/talk-dataframes", + "url": "https://github.com/astrojuanlu/talk-dataframes" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + }, + { + "label": "https://nbviewer.org/format/slides/github/astrojuanlu/talk-dataframes/blob/main/slides.ipynb", + "url": "https://nbviewer.org/format/slides/github/astrojuanlu/talk-dataframes/blob/main/slides.ipynb" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/GvYeBHNGlvM/maxresdefault.jpg", + "title": "Juan Luis Cano Rodr\u00edguez - Beyond Pandas: The Great Python Dataframe Showdown | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=GvYeBHNGlvM" + } + ] +} diff --git a/pydata-london-2022/videos/julien-simon-machine-learning-2-0-with-hugging-face-pydata-london-2022.json b/pydata-london-2022/videos/julien-simon-machine-learning-2-0-with-hugging-face-pydata-london-2022.json new file mode 100644 index 000000000..ce9432b0b --- /dev/null +++ b/pydata-london-2022/videos/julien-simon-machine-learning-2-0-with-hugging-face-pydata-london-2022.json @@ -0,0 +1,43 @@ +{ + 
"description": "Julien Simon Presents: \n\nMachine Learning 2.0 with Hugging Face\n\nIn this session, we\u2019ll introduce you to Transformer models and what business problems you can solve with them. Then, we\u2019ll show you how you can simplify and accelerate your machine learning projects end-to-end: experimenting, training, optimizing, and deploying. Along the way, we\u2019ll run some demos to keep things concrete and exciting!\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n00:10 Help us add time stamps or captions to this video! See the description for details.\n\nWant to help add timestamps to our YouTube videos to help with discoverability? 
Find out more here: https://github.com/numfocus/YouTubeVideoTimestamps", + "duration": 2911, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVideoTimestamps", + "url": "https://github.com/numfocus/YouTubeVideoTimestamps" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/Hz4t_rAtdxg/maxresdefault.jpg", + "title": "Julien Simon - Machine Learning 2.0 With Hugging Face | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=Hz4t_rAtdxg" + } + ] +} diff --git a/pydata-london-2022/videos/kajanan-sangaralingam-and-anindya-datta-feature-engineering-made-simple-pydata-london-2022.json b/pydata-london-2022/videos/kajanan-sangaralingam-and-anindya-datta-feature-engineering-made-simple-pydata-london-2022.json new file mode 100644 index 000000000..37f7ca56f --- /dev/null +++ b/pydata-london-2022/videos/kajanan-sangaralingam-and-anindya-datta-feature-engineering-made-simple-pydata-london-2022.json @@ -0,0 +1,51 @@ +{ + "description": "Kajanan Sangaralingam and Anindya Datta present:\n\nFeature Engineering Made Simple\n\nOf all the choices made by data scientists in the course of building and operating models, feature engineering & selection is one of the most critical. Features have a substantive impact on a model\u2019s quality, including its predictive accuracy and resilience. Unfortunately, as most ML scientists and practitioners are aware, feature engineering is more art than science. 
It is ad-hoc, messy, error-prone and ends up consuming 70-80% of the time and effort when building models, often resulting in sub-optimal feature selection leading to low-quality models. In this tutorial, we will introduce new ways of performing feature engineering, turning it into a systematic, procedural and scalable process, which is substantively more efficient than how it occurs currently. Participants will perform a hands-on, end-to-end, feature building exercise, with particular emphasis on feature engineering using Anovos (https://anovos.ai/ or https://github.com/anovos/anovos)\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? 
Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 5258, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/anovos/anovos", + "url": "https://github.com/anovos/anovos" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + }, + { + "label": "https://anovos.ai/", + "url": "https://anovos.ai/" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/DvDxS4uKj5Q/maxresdefault.jpg", + "title": "Kajanan Sangaralingam and Anindya Datta - Feature Engineering Made Simple | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=DvDxS4uKj5Q" + } + ] +} diff --git a/pydata-london-2022/videos/kishan-manani-feature-engineering-for-time-series-forecasting-pydata-london-2022.json b/pydata-london-2022/videos/kishan-manani-feature-engineering-for-time-series-forecasting-pydata-london-2022.json new file mode 100644 index 000000000..cc89b20d9 --- /dev/null +++ b/pydata-london-2022/videos/kishan-manani-feature-engineering-for-time-series-forecasting-pydata-london-2022.json @@ -0,0 +1,43 @@ +{ + "description": "Kishan Manani presents:\n\nFeature Engineering for Time Series Forecasting\n\nTo use our favourite supervised learning models for time series forecasting we first have to convert time series data into a tabular dataset of features and a target variable. 
In this talk we\u2019ll discuss all the tips, tricks, and pitfalls in transforming time series data into tabular data for forecasting.\n\nGithub/Slides: https://github.com/KishManani/PyDataLondon2022\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.", + "duration": 2564, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/KishManani/PyDataLondon2022", + "url": "https://github.com/KishManani/PyDataLondon2022" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/9QtL7m3YS9I/maxresdefault.jpg", + "title": "Kishan Manani - Feature Engineering for Time Series Forecasting | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=9QtL7m3YS9I" + } + ] +} diff --git a/pydata-london-2022/videos/laszlo-sragner-clean-architecture-how-to-structure-your-ml-projects-to-reduce-technical-debt.json 
b/pydata-london-2022/videos/laszlo-sragner-clean-architecture-how-to-structure-your-ml-projects-to-reduce-technical-debt.json new file mode 100644 index 000000000..b6bca6189 --- /dev/null +++ b/pydata-london-2022/videos/laszlo-sragner-clean-architecture-how-to-structure-your-ml-projects-to-reduce-technical-debt.json @@ -0,0 +1,43 @@ +{ + "description": "Laszlo Sragner Presents:\n\nClean Architecture: How to Structure Your ML Projects to Reduce Technical Debt.\n\nSoftware engineering principles are frequently mentioned as a solution to data science's productivity problem. Unfortunately, rarely in a comprehensive format to be actionable or adopted for data-intensive use.\n\nIn this talk, I will present a framework that enables practitioners to structure their projects and manage changes throughout the product lifecycle at low effort.\n\nAudience will also learn about a minimum set of programming concepts to make this a reality.\n\nThe key takeaway for any Data Scientist is that you don't need to be a master programmer to start taking care of your own codebase.\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? 
Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2207, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/QXfsS-ZOeyA/maxresdefault.jpg", + "title": "Laszlo Sragner - Clean Architecture: How to Structure Your ML Projects to Reduce Technical Debt", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=QXfsS-ZOeyA" + } + ] +} diff --git a/pydata-london-2022/videos/lightning-talks-01-at-pydata-london-2022-pydata-london-2022.json b/pydata-london-2022/videos/lightning-talks-01-at-pydata-london-2022-pydata-london-2022.json new file mode 100644 index 000000000..35174aaa6 --- /dev/null +++ b/pydata-london-2022/videos/lightning-talks-01-at-pydata-london-2022-pydata-london-2022.json @@ -0,0 +1,39 @@ +{ + "description": "Lightning Talks at Day 2 PyData London 2022 \n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. 
\n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.", + "duration": 4572, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/NtypFPaMGD0/maxresdefault.jpg", + "title": "Lightning Talks (01) at PyData London 2022 | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=NtypFPaMGD0" + } + ] +} diff --git a/pydata-london-2022/videos/lightning-talks-02-closing-notes-pydata-london-2022.json b/pydata-london-2022/videos/lightning-talks-02-closing-notes-pydata-london-2022.json new file mode 100644 index 000000000..8ac88b40f --- /dev/null +++ b/pydata-london-2022/videos/lightning-talks-02-closing-notes-pydata-london-2022.json @@ -0,0 +1,43 @@ +{ + "description": "www.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. 
PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n00:10 Help us add time stamps or captions to this video! See the description for details.\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVideoTimestamps", + "duration": 3748, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVideoTimestamps", + "url": "https://github.com/numfocus/YouTubeVideoTimestamps" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/IEgDpCz4BJI/maxresdefault.jpg", + "title": "Lightning Talks (02) & Closing Notes | Pydata London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=IEgDpCz4BJI" + } + ] +} diff --git a/pydata-london-2022/videos/marysia-winkels-models-schm-odels-why-you-should-care-about-data-centric-ai-pydata-london-2022.json b/pydata-london-2022/videos/marysia-winkels-models-schm-odels-why-you-should-care-about-data-centric-ai-pydata-london-2022.json new file mode 100644 index 000000000..4b8527fed --- /dev/null +++ b/pydata-london-2022/videos/marysia-winkels-models-schm-odels-why-you-should-care-about-data-centric-ai-pydata-london-2022.json @@ -0,0 +1,47 @@ +{ + "description": "Marysia Winkels Presents: \n\nModels Schm-odels: Why You Should Care About Data-Centric AI\n\nData Centric AI is the term coined by AI pioneer Andrew Ng for the movement that argues we shift our focus towards iterating on our data instead of models to improve machine learning predictions. 
But isn't this what we have always done? Why is this trend relevant now? Has something really changed, and if so, how does that change your work as a data scientist?\n\nThis talk will feature anecdotes and real-world examples of 'model-itis' that serve as an argument for data-centric AI, our lessons learned from winning the Data Centric AI competition, and practical tips on how you can integrate data-centric principles in your daily work.\n\nSlides: https://marysia.nl/assets/data-centric-ai-pydata-london.pdf\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n00:10 Help us add time stamps or captions to this video! See the description for details.\n\nWant to help add timestamps to our YouTube videos to help with discoverability? 
Find out more here: https://github.com/numfocus/YouTubeVideoTimestamps", + "duration": 2469, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVideoTimestamps", + "url": "https://github.com/numfocus/YouTubeVideoTimestamps" + }, + { + "label": "https://marysia.nl/assets/data-centric-ai-pydata-london.pdf", + "url": "https://marysia.nl/assets/data-centric-ai-pydata-london.pdf" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/vgtdPwUrP5I/maxresdefault.jpg", + "title": "Marysia Winkels - Models Schm-odels: Why you Should Care about Data-Centric AI | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=vgtdPwUrP5I" + } + ] +} diff --git a/pydata-london-2022/videos/natan-mish-data-validation-for-data-science-pydata-2022.json b/pydata-london-2022/videos/natan-mish-data-validation-for-data-science-pydata-2022.json new file mode 100644 index 000000000..247f6548a --- /dev/null +++ b/pydata-london-2022/videos/natan-mish-data-validation-for-data-science-pydata-2022.json @@ -0,0 +1,47 @@ +{ + "description": "Natan Mish presents:\n\nData Validation for Data Science \n\nHave you ever worked really hard on choosing the best algorithm, tuned the parameters to perfection and built awesome feature engineering methods only to have everything break because of a null value? Then this tutorial is for you! Data validation is often neglected in the process of working on data science projects. 
In this tutorial, we will demonstrate the importance of implementing data validation for data science in commercial, open-source, and even hobby projects. We will then dive into some of the open-source tools available for validating data in Python and learn how to use them so that edge cases will never break our models. The open-source Python community will come to our help and we will explore wonderful packages such as Pydantic for defining data models, Pandera for complementing the use of Pandas, and Great Expectations for diving deep into the data. This tutorial will benefit anyone working on data projects in Python who want to learn about data validation. Some Python programming experience and understanding of data science are required. The examples used and the context of the discussion is around data science, but the knowledge can be implemented in any Python oriented project. \n\nGithub Repo: https://github.com/NatanMish/data_validation\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.\n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? 
Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 5043, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/NatanMish/data_validation", + "url": "https://github.com/NatanMish/data_validation" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/FuUE6an5bXI/maxresdefault.jpg", + "title": "Natan Mish - Data Validation for Data Science | PyData 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=FuUE6an5bXI" + } + ] +} diff --git a/pydata-london-2022/videos/nick-radcliffe-parallelism-the-old-way-using-mpi-in-python-with-mpi4py-pydata-london-2022.json b/pydata-london-2022/videos/nick-radcliffe-parallelism-the-old-way-using-mpi-in-python-with-mpi4py-pydata-london-2022.json new file mode 100644 index 000000000..3a9cdf350 --- /dev/null +++ b/pydata-london-2022/videos/nick-radcliffe-parallelism-the-old-way-using-mpi-in-python-with-mpi4py-pydata-london-2022.json @@ -0,0 +1,47 @@ +{ + "description": "Nick Radcliffe presents:\n\nParallelism the Old Way: Using MPI in Python with MPI4Py\n\nMPI is one of the oldest best-established and best-tested approaches to parallel computing, with bindings for most languages and availability on most systems. 
MPI uses explicit message passing and can be used on \"shared-nothing\" systems (in which each process/processor has its own memory, unavailable to other processors) as well as shared-memory systems, (uniform and non-uniform).\n\nThis tutorial will provide a gentle introduction to parallel computing using specifically MPI using the Python MPI4Py library.\n\nPresentation Deck: https://pydata.org/london2022/wp-content/uploads/2022/06/mpi-pydata-london-2022.pdf\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 5449, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." 
+ }, + { + "label": "https://pydata.org/london2022/wp-content/uploads/2022/06/mpi-pydata-london-2022.pdf", + "url": "https://pydata.org/london2022/wp-content/uploads/2022/06/mpi-pydata-london-2022.pdf" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/noMezSAT-w4/maxresdefault.jpg", + "title": "Nick Radcliffe - Parallelism the Old Way: Using MPI in Python with mpi4py | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=noMezSAT-w4" + } + ] +} diff --git a/pydata-london-2022/videos/nick-sorros-extreme-multilabel-classification-in-the-biomedical-nlp-domain-pydata-london-2022.json b/pydata-london-2022/videos/nick-sorros-extreme-multilabel-classification-in-the-biomedical-nlp-domain-pydata-london-2022.json new file mode 100644 index 000000000..aa75d1af0 --- /dev/null +++ b/pydata-london-2022/videos/nick-sorros-extreme-multilabel-classification-in-the-biomedical-nlp-domain-pydata-london-2022.json @@ -0,0 +1,43 @@ +{ + "description": "Nick Sorros Presents:\n\nExtreme Multilabel Classification in the Biomedical NLP Domain\n\nExtreme multilabel classification refers to cases where the prediction space of a multilabel classifier is in the thousands of millions of labels which is an order of magnitude more than typical problems. The scale of such problems brings some unique challenges that one has to work around with such as memory, model size, train and inference time. 
This talk will discuss 1) how you can overcome those challenges, 2) relevant state-of-the-art architectures for this problem, and 3) lessons learned from developing a transformer-based NLP model to tag biomedical grants with 29K MeSH tags\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2045, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." 
+ } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/NaBZksPgqR4/maxresdefault.jpg", + "title": "Nick Sorros - Extreme Multilabel Classification in the Biomedical NLP Domain | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=NaBZksPgqR4" + } + ] +} diff --git a/pydata-london-2022/videos/pedro-tabacof-unlocking-the-power-of-gradient-boosted-trees-using-lightgbm-pydata-london-2022.json b/pydata-london-2022/videos/pedro-tabacof-unlocking-the-power-of-gradient-boosted-trees-using-lightgbm-pydata-london-2022.json new file mode 100644 index 000000000..573c55dec --- /dev/null +++ b/pydata-london-2022/videos/pedro-tabacof-unlocking-the-power-of-gradient-boosted-trees-using-lightgbm-pydata-london-2022.json @@ -0,0 +1,51 @@ +{ + "description": "Pedro Tabacof Presents:\n\nUnlocking the Power of Gradient-Boosted Trees (using LightGBM)\n\nGradient-boosted trees (XGBoost, LightGBM, Catboost) have become the staple of machine learning for tabular datasets. While most data scientists have made use of them at some point, many don\u2019t know the true power those Python libraries provide. I will take LightGBM as an example and show in practice how it handles missing value imputation and categorical encoding natively, the different loss functions it provides for different problems (including the creation of your own loss function!), and how to interpret the resulting models. 
My aim is to show how LightGBM is like a Swiss army knife for machine learning and why it is the most pragmatic choice for tabular problems.\n\nGithub: https://github.com/catboost/catboost/blob/master/slides/2019_PyData_London/2019_PyData_London.pdf\nSlides: https://pydata.org/london2022/wp-content/uploads/2022/07/PyData-London-2022_-Unlocking-the-power-of-LightGBM-summarized.pdf\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? 
Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2237, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://pydata.org/london2022/wp-content/uploads/2022/07/PyData-London-2022_-Unlocking-the-power-of-LightGBM-summarized.pdf", + "url": "https://pydata.org/london2022/wp-content/uploads/2022/07/PyData-London-2022_-Unlocking-the-power-of-LightGBM-summarized.pdf" + }, + { + "label": "https://github.com/catboost/catboost/blob/master/slides/2019_PyData_London/2019_PyData_London.pdf", + "url": "https://github.com/catboost/catboost/blob/master/slides/2019_PyData_London/2019_PyData_London.pdf" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/qGsHlvE8KZM/maxresdefault.jpg", + "title": "Pedro Tabacof - Unlocking the Power of Gradient-Boosted Trees (using LightGBM) | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=qGsHlvE8KZM" + } + ] +} diff --git a/pydata-london-2022/videos/richard-pelgrim-data-science-at-scale-with-dask-pydata-london-2022.json b/pydata-london-2022/videos/richard-pelgrim-data-science-at-scale-with-dask-pydata-london-2022.json new file mode 100644 index 000000000..f3e33a23f --- /dev/null +++ b/pydata-london-2022/videos/richard-pelgrim-data-science-at-scale-with-dask-pydata-london-2022.json @@ -0,0 +1,47 @@ +{ + "description": "Richard Pelgrim presents:\n\nData Science at Scale with Dask\n\nThis tutorial is an introduction to Dask, an OSS Python library for distributed computing. 
We will walk through the many ways you can apply Dask to scale your Python code to work with larger datasets and/or transcend other compute-bound limitations. The tutorial assumes no prior knowledge of Dask.\n\nThis tutorial will cover\n\n- How to scale pandas with Dask\n- How to scale NumPy with Dask\n- How to parallelise your existing Python code with Dask\n- How to scale to the cloud with Dask and Coiled\n\nGithub Repo: https://github.com/rrpelgrim/dask-mini-tutorial\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 5049, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." 
+ }, + { + "label": "https://github.com/rrpelgrim/dask-mini-tutorial", + "url": "https://github.com/rrpelgrim/dask-mini-tutorial" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/UnUPjjnkB3g/maxresdefault.jpg", + "title": "Richard Pelgrim - Data Science at Scale with Dask | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=UnUPjjnkB3g" + } + ] +} diff --git a/pydata-london-2022/videos/sam-morley-signature-methods-for-time-series-data-pydata-london-2022.json b/pydata-london-2022/videos/sam-morley-signature-methods-for-time-series-data-pydata-london-2022.json new file mode 100644 index 000000000..b47b19e0b --- /dev/null +++ b/pydata-london-2022/videos/sam-morley-signature-methods-for-time-series-data-pydata-london-2022.json @@ -0,0 +1,43 @@ +{ + "description": "Sam Morley Presents:\n\nSignature Methods for Time Series Data \n\nSignatures are a mathematical tool that arises in the study of paths. Roughly speaking, they capture the fine structure of a path. It turns out that signatures are extremely useful for analysing time series data in a data science context. This is partly because they can take irregularly sampled, highly oscillatory data and produce a single array of values of fixed size which can then be used as features in predictors etc. In this talk I will give a brief introduction to signatures and demonstrate how you can use them to analyse time series data. 
No mathematical background will be assumed.\n\nSlides: https://github.com/inakleinbottle/talks/blob/9e6cdcb74dae62767a851194530fca6bcbdb6aa6/signatures-methods-for-time-series-data.pdf\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.", + "duration": 2892, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/inakleinbottle/talks/blob/9e6cdcb74dae62767a851194530fca6bcbdb6aa6/signatures-methods-for-time-series-data.pdf", + "url": "https://github.com/inakleinbottle/talks/blob/9e6cdcb74dae62767a851194530fca6bcbdb6aa6/signatures-methods-for-time-series-data.pdf" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/pkZhtscaX1M/maxresdefault.jpg", + "title": "Sam Morley - Signature Methods for Time Series Data | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=pkZhtscaX1M" + } + ] +} diff --git 
a/pydata-london-2022/videos/sarah-diot-girard-off-with-their-i-os-or-how-to-contain-madness-by-isolating-your-code.json b/pydata-london-2022/videos/sarah-diot-girard-off-with-their-i-os-or-how-to-contain-madness-by-isolating-your-code.json new file mode 100644 index 000000000..76059baf3 --- /dev/null +++ b/pydata-london-2022/videos/sarah-diot-girard-off-with-their-i-os-or-how-to-contain-madness-by-isolating-your-code.json @@ -0,0 +1,47 @@ +{ + "description": "Sarah Diot-Girard Presents: \n\n\u201cOff with their I/Os!\u201d - Or How to Contain Madness by Isolating Your Code\n\nEngulfed in a tedious refactoring of your code, you\u2019re adding the 7th layer of mocks to a test when you realise something must have gone wrong somewhere. But what? You\u2019ve written readable code, split into functions and classes to avoid long chunks of code, and yet, every time, you end up with hardly testable code, a test suite that runs for hours, functions with seventeen arguments. You wonder if it\u2019s you mocking the code, or the code mocking you. \n\nFollow the white rabbit with Sarah Diot-Girard to learn about usual problems of code organization and I/O architecture and some tricks on how to handle I/Os and dependencies isolation. We might encounter a bit of SOLID advice, and maybe even a nice hat!\n\nSlide Deck: http://sdg.jlbl.net/slides/architecture-principles-for-datascientists/#/\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. 
\n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2154, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + }, + { + "label": "http://sdg.jlbl.net/slides/architecture-principles-for-datascientists/#/", + "url": "http://sdg.jlbl.net/slides/architecture-principles-for-datascientists/#/" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/8sFG23k2FJ0/maxresdefault.jpg", + "title": "Sarah Diot-Girard - Off with their I/Os!\u2014Or How to Contain Madness by Isolating Your Code", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=8sFG23k2FJ0" + } + ] +} diff --git a/pydata-london-2022/videos/simon-ward-jones-introducing-more-of-the-standard-library-pydata-london-2022.json b/pydata-london-2022/videos/simon-ward-jones-introducing-more-of-the-standard-library-pydata-london-2022.json new file mode 100644 index 000000000..3824eb7f8 --- /dev/null +++ b/pydata-london-2022/videos/simon-ward-jones-introducing-more-of-the-standard-library-pydata-london-2022.json @@ -0,0 +1,47 @@ +{ + "description": "Simon Ward-Jones presents:\n\nIntroducing More of the Standard Library\n\nPython comes with many standard library packages included without any \"pip 
install\"! In this beginners tutorial we will go through a few of these with some interactive challenges during the session. Specifically we will dive into pathlib, datetime, collections, itertools and functools and how these can help you.\n\nGithub Repo: https://github.com/simonwardjones/pydata-talk-2022\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 4686, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/simonwardjones/pydata-talk-2022", + "url": "https://github.com/simonwardjones/pydata-talk-2022" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." 
+ } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/ypApmOoCRSc/maxresdefault.jpg", + "title": "Simon Ward-Jones - Introducing More of the Standard Library | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=ypApmOoCRSc" + } + ] +} diff --git a/pydata-london-2022/videos/sylvain-corlay-possible-futures-for-jupyter-pydata-london-2022.json b/pydata-london-2022/videos/sylvain-corlay-possible-futures-for-jupyter-pydata-london-2022.json new file mode 100644 index 000000000..5ad9c227f --- /dev/null +++ b/pydata-london-2022/videos/sylvain-corlay-possible-futures-for-jupyter-pydata-london-2022.json @@ -0,0 +1,43 @@ +{ + "description": "Sylvain Corlay Presents:\n\nPossible Futures for Jupyter\n\nJupyter has changed the way we think about interactive computing, scientific communication, and science education as it has been adopted globally, both in academia and industry.\n\nIn this talk, Corlay presents a bold vision for future applications and developments in the Jupyter ecosystem, some of which are just around the corner, while others are still mere possibilities, but have the potential for significant impact, for an even greater number of people. \n\nwww.pydata.org\n\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. 
PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 1227, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/4041pEGsW6w/maxresdefault.jpg", + "title": "Sylvain Corlay - Possible Futures For Jupyter | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=4041pEGsW6w" + } + ] +} diff --git a/pydata-london-2022/videos/tambe-tabitha-a-how-pyodide-and-a-new-opensource-community-are-improving-childrens-social-work.json b/pydata-london-2022/videos/tambe-tabitha-a-how-pyodide-and-a-new-opensource-community-are-improving-childrens-social-work.json new file mode 100644 index 000000000..98dd43137 --- /dev/null +++ b/pydata-london-2022/videos/tambe-tabitha-a-how-pyodide-and-a-new-opensource-community-are-improving-childrens-social-work.json @@ -0,0 +1,47 @@ +{ + "description": "Tambe Tabitha Achere Presents:\n\nHow Pyodide and a new Opensource Community are Improving Children\u2019s Social Work\n\nSocial care workers support the most disadvantaged children in the UK 
and we help improve the sector with Data and Digital. Due to the extremely sensitive nature of the data in this context and long bureaucratic processes, data tools could neither be created to function on the internet nor could be installed by the users. This is a talk about how we coached social care workers to build a data cleaning tool and how Pyodide enabled it to scale. This talk is for people intrigued by complex problems. No previous knowledge is required.\n\nSlides: https://drive.google.com/file/d/1-2SuSCjTYaHhAMC1QLnQlMAc2lGWhKVn/view?usp=sharing\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. 
PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2281, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://drive.google.com/file/d/1-2SuSCjTYaHhAMC1QLnQlMAc2lGWhKVn/view?usp=sharing", + "url": "https://drive.google.com/file/d/1-2SuSCjTYaHhAMC1QLnQlMAc2lGWhKVn/view?usp=sharing" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/g3vX4zKMmrs/maxresdefault.jpg", + "title": "Tambe Tabitha A. 
- How Pyodide and a New Opensource Community Are Improving Children\u2019s Social Work", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=g3vX4zKMmrs" + } + ] +} diff --git a/pydata-london-2022/videos/tania-allard-key-challenges-in-the-pydata-ecosystem-and-how-we-can-all-make-a-difference.json b/pydata-london-2022/videos/tania-allard-key-challenges-in-the-pydata-ecosystem-and-how-we-can-all-make-a-difference.json new file mode 100644 index 000000000..4a1fddb83 --- /dev/null +++ b/pydata-london-2022/videos/tania-allard-key-challenges-in-the-pydata-ecosystem-and-how-we-can-all-make-a-difference.json @@ -0,0 +1,43 @@ +{ + "description": "Tania Allard Presents:\n\nKey Challenges in the PyData Ecosystem and How We Can All Make a Difference\n\nThe PyData - and more broadly the scientific computing - ecosystem has seen massive growth both in adoption and complexity over the last few years, maybe decades. As for many other open-source ecosystems, this growth has also opened the door to complex socio-technical challenges, many of which can directly impact the long-term sustainability of the ecosystem and its community.\n\nThis talk will dive into some of these current challenges and opportunities for us, the users, contributors, maintainers, activists, sponsors, and insert many other hats to help overcome those hurdles.\n\nAll while being intentional about the core tenets of collaboration, transparency, and openness that fuel our ecosystem.\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization.
PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 3212, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/curmiRtbciA/maxresdefault.jpg", + "title": "Tania Allard - Key Challenges In the PyData Ecosystem and How we Can All Make a Difference", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=curmiRtbciA" + } + ] +} diff --git a/pydata-london-2022/videos/thomas-wiecki-solving-real-world-business-problems-with-bayesian-modeling-pydata-london-2022.json b/pydata-london-2022/videos/thomas-wiecki-solving-real-world-business-problems-with-bayesian-modeling-pydata-london-2022.json new file mode 100644 index 000000000..ee238348c --- /dev/null +++ b/pydata-london-2022/videos/thomas-wiecki-solving-real-world-business-problems-with-bayesian-modeling-pydata-london-2022.json @@ -0,0 +1,39 @@ +{ + "description": "Thomas Wiecki Presents:\n\nSolving Real-World Business Problems with Bayesian Modeling\n\nAmong Bayesian early adopters, digital marketing is chief. 
While many industries are embracing Bayesian modeling as a tool to solve some of the most advanced data science problems, marketing is facing unique challenges for which this approach provides elegant solutions. Among these challenges are a decrease in quality data, driven by an increased demand for online privacy and the imminent \"death of the cookie\" which prohibits online tracking. In addition, as more companies are building internal data science teams, there is an increased demand for in-house solutions.\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. 
PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n0:05 Speaker introduction and PyMC 4 release announcement\n1:15 PyMC Labs- The Bayesian consultancy\n2:39 Why is marketing so eager to adopt Bayesian solutions\n3:49 Case Study: Estimating Marketing effectiveness\n6:00 Estimating Customer Acquisition Cost (CAC) using linear regression\n7:36 Drawbacks of linear regression in estimating CAC\n10:02 Blackbox Machine learning and its drawbacks\n11:27 Bayesian modelling\n11:52 Advantages of Bayesian modelling\n14:12 How does Bayesian modelling work?\n16:53 Solution proposals(priors)\n17:26 Model structure\n19:57 Evaluate solutions\n20:16 Plausible solutions(posterior)\n22:36 Improving the model\n23:38 Modelling multiple Marketing Channels\n24:51 Modelling channel similarities with hierarchy\n26:13 Allowing CAC to change over time\n28:00 Hierarchical Time Varying process\n30:05 Comparing Bayesian Media Mix Models\n30:47 What-If Scenario Forecasting\n31:53 Adding other data sources as a way to help improve or inform estimates\n33:00 When does Bayesian modelling work best?\n33:35 Intuitive Bayes course\n34:38 Question 1: Effectiveness of including variables seasonality?\n36:03 Question 2: What is your recommendation for the best way to choose priors?\n38:16 Question 3: How to test if an assumption about the data is valid?\n39:07 Question 4: Do you take the effect of different channels on each other into account?\n41:33 Thank you!", + "duration": 2504, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": 
"https://i.ytimg.com/vi/twpZhNqVExc/maxresdefault.jpg", + "title": "Thomas Wiecki - Solving Real-World Business Problems with Bayesian Modeling | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=twpZhNqVExc" + } + ] +} diff --git a/pydata-london-2022/videos/thusal-fuzzy-matching-at-scale-pydata-london-2022.json b/pydata-london-2022/videos/thusal-fuzzy-matching-at-scale-pydata-london-2022.json new file mode 100644 index 000000000..ac19276f6 --- /dev/null +++ b/pydata-london-2022/videos/thusal-fuzzy-matching-at-scale-pydata-london-2022.json @@ -0,0 +1,43 @@ +{ + "description": "Thusal Presents:\n\nFuzzy Matching at Scale\n\nFuzzy Matching is a useful tool that has been well discussed. However, these popular methods based on edit-distances like Levenshtein or Jaro-Winkler have failed to keep up with increasing data sizes. This talk will walk you through modern methods based on character-based n-grams, vector space models, and approximate nearest neighbours for Fuzzy Matching at Scale.\n\nThis talk provides a gentle and highly visual overview of some of the main intuitions and real-world applications of large language models. It assumes no prior knowledge of language processing and aims to bring attendees up to date with the fundamental intuitions and applications of large language models.\n\nwww.pydata.org\n\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. 
\n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 1776, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/jD1sA1D6ukM/maxresdefault.jpg", + "title": "Thusal - Fuzzy Matching at Scale | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=jD1sA1D6ukM" + } + ] +} diff --git a/pydata-london-2022/videos/usman-zafar-using-graph-neural-networks-to-embrace-the-dependency-within-your-data.json b/pydata-london-2022/videos/usman-zafar-using-graph-neural-networks-to-embrace-the-dependency-within-your-data.json new file mode 100644 index 000000000..80ecedfde --- /dev/null +++ b/pydata-london-2022/videos/usman-zafar-using-graph-neural-networks-to-embrace-the-dependency-within-your-data.json @@ -0,0 +1,47 @@ +{ + "description": "Usman Zafar Presents:\n\nUsing Graph Neural Networks to Embrace the Dependency Within Your Data\n\nMany machine learning models we use today have the core assumption that our data needs to be tabular, but how often is this truly the case? What if our data points are not independent? 
By ignoring the potential interrelatedness of our data, do we lose meaningful information that our models cannot leverage? In this talk, we shall explore graph neural networks and highlight how they can solve interesting problems in a way that is intractable when limiting ourselves to using tabular data.\n\nWe will look at the limitations of common algorithms and highlight how some clever linear algebra enables us to incorporate more meaningful information into our models. Social network data is a popular example of where relationships are relevant but relationships exist in many types of data where it may not be so obvious. Whether it's e-commerce, logistics or molecular data, relationships within your data likely exist and making use of them can be incredibly powerful.\n\nThis talk will hopefully spark your curiosity and provide you with a way of looking at problems from a new angle. It is intended for anyone with an interest in machine learning and will only lightly touch on some technical details\n\nSlides: https://pydata.org/london2022/wp-content/uploads/2022/07/pres.pptx\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\n\nWant to help add timestamps to our YouTube videos to help with discoverability? 
Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2335, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://pydata.org/london2022/wp-content/uploads/2022/07/pres.pptx", + "url": "https://pydata.org/london2022/wp-content/uploads/2022/07/pres.pptx" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/I74zSp9udT8/maxresdefault.jpg", + "title": "Usman Zafar - Using Graph Neural Networks to Embrace the Dependency Within Your Data", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=I74zSp9udT8" + } + ] +} diff --git a/pydata-london-2022/videos/valerio-maggio-rethinking-data-visulisation-with-pyscript-pydata-london-2022.json b/pydata-london-2022/videos/valerio-maggio-rethinking-data-visulisation-with-pyscript-pydata-london-2022.json new file mode 100644 index 000000000..7b1e8ebfb --- /dev/null +++ b/pydata-london-2022/videos/valerio-maggio-rethinking-data-visulisation-with-pyscript-pydata-london-2022.json @@ -0,0 +1,43 @@ +{ + "description": "Valerio Maggio Presents:\n\nRethinking Data Visualisation with PyScript\n\nPyScript leverages on the web browser to act as a ubiquitous virtual machine to deliver unprecedented Data Science use cases. Data Visualisation is the first and perhaps the most straightforward context in which PyScript can have its say. 
In this talk, we will present how PyScript can change the way data visualisation apps can be designed and delivered for complex data science use cases.\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 2650, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." 
+ } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/m6UNCJESYHM/maxresdefault.jpg", + "title": "Valerio Maggio - Rethinking Data Visualisation with PyScript | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=m6UNCJESYHM" + } + ] +} diff --git a/pydata-london-2022/videos/vincenzo-crescimanna-train-object-detection-with-small-datasets-pydata-london-2022.json b/pydata-london-2022/videos/vincenzo-crescimanna-train-object-detection-with-small-datasets-pydata-london-2022.json new file mode 100644 index 000000000..42dd14296 --- /dev/null +++ b/pydata-london-2022/videos/vincenzo-crescimanna-train-object-detection-with-small-datasets-pydata-london-2022.json @@ -0,0 +1,43 @@ +{ + "description": "Vincenzo Crescimanna presents:\n\nTrain Object Detection With Small Datasets\n\nObject detection, the task of localising and classifying objects in a scene, one of the most popular tasks in Computer Vision, has a main drawback: a large annotated dataset is necessary to train the model. Indeed, annotating a dataset is expensive, and the freely available datasets are not enough, as they do not contain all the classes we are interested in. Thus, the goal of the tutorial is to introduce the main techniques to train a good object detector utilising the minimum amount of annotated data.\n\nwww.pydata.com \n\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other.
The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...", + "duration": 4523, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVi...", + "url": "https://github.com/numfocus/YouTubeVi..." + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/YdPOZzlo2kI/maxresdefault.jpg", + "title": "Vincenzo Crescimanna - Train Object Detection with Small Datasets | PyData London 2022", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=YdPOZzlo2kI" + } + ] +} diff --git a/pydata-london-2022/videos/yizhar-izzy-toren-testing-testing-on-experimental-drift-and-data-driven-product-design.json b/pydata-london-2022/videos/yizhar-izzy-toren-testing-testing-on-experimental-drift-and-data-driven-product-design.json new file mode 100644 index 000000000..9f5eedf55 --- /dev/null +++ b/pydata-london-2022/videos/yizhar-izzy-toren-testing-testing-on-experimental-drift-and-data-driven-product-design.json @@ -0,0 +1,43 @@ +{ + "description": "Yizhar (Izzy) Toren 
presents:\n\nTesting, Testing: On Experimental Drift and Data Driven Product Design\n\nA/B testing is (and should be) the gold standard for making data driven decisions. However, basing your decisions solely on tests can lead to very bad product decisions, primarily because of different types of hard-to-track changes to your environment (aka \"experimental drift\"). In this talk I will explain what experimental drift is and how it can affect your product design and A/B testing choices. I will also review a few strategies of handling drift as a data scientist working in a product team and show examples.\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n00:10 Help us add time stamps or captions to this video! See the description for details.\n\nWant to help add timestamps to our YouTube videos to help with discoverability? 
Find out more here: https://github.com/numfocus/YouTubeVideoTimestamps", + "duration": 2271, + "language": "eng", + "recorded": "2022-06-17", + "related_urls": [ + { + "label": "Conference Website", + "url": "https://pydata.org/london2022/" + }, + { + "label": "https://github.com/numfocus/YouTubeVideoTimestamps", + "url": "https://github.com/numfocus/YouTubeVideoTimestamps" + } + ], + "speakers": [ + "TODO" + ], + "tags": [ + "Education", + "Julia", + "NumFOCUS", + "Opensource", + "PyData", + "Python", + "Tutorial", + "coding", + "how to program", + "learn", + "learn to code", + "python 3", + "scientific programming", + "software" + ], + "thumbnail_url": "https://i.ytimg.com/vi/xbaI__8lZig/maxresdefault.jpg", + "title": "Yizhar (Izzy) Toren - Testing, Testing: On Experimental Drift and Data Driven Product Design", + "videos": [ + { + "type": "youtube", + "url": "https://www.youtube.com/watch?v=xbaI__8lZig" + } + ] +}