Search results “Data mining with big data slideshare net”
What Big Data Is and Why It Is Terribly Interesting - Andrey Sebrant (01.02.2014)
 
01:37:54
[Slides](http://www.slideshare.net/yandex/big-data-30799013?ref=http://habrahabr.ru/company/yandex/blog/214217/) [On Habr](http://habrahabr.ru/company/yandex/blog/214217/)
Views: 42235 Arthur Vard
BigData and Old Data: Embedding Predictive Analytics in Real Applications
 
01:10:06
Usama Fayyad, Chief Data Officer at Barclays Bank presents at RapidMiner World 2014 on the challenges of making the benefits of advanced analytics fit with the business or target area of application. Topics discussed include embedding data mining insights and models into production processes and live deployments, real-time data streaming and in situ data mining, BigData, unstructured data, and Hadoop. Access Usama's slides here: http://www.slideshare.net/RapidMiner/big-data-vs-classic-data-usama-fayyad
Views: 1436 RapidMiner, Inc.
Data Science - Part XI - Text Analytics
 
01:57:28
For downloadable versions of these lectures, please go to the following link: http://www.slideshare.net/DerekKane/presentations https://github.com/DerekKane/YouTube-Tutorials This is an introduction to text analytics for advanced business users and IT professionals with limited programming expertise. The presentation will go through different areas of text analytics as well as provide some real-world examples that help to make the subject matter a little more relatable. We will cover topics like search engine building, categorization (supervised and unsupervised), clustering, NLP, and social media analysis.
Views: 16222 Derek Kane
Introduction to Big Data and Hadoop in Hindi
 
13:50
Introduction to Big Data and Hadoop in Hindi. Topics covered: introduction to big data; big data sources; use cases of big data; introduction to Hadoop; Hadoop components; Hadoop daemons; Hadoop scale-out storage; Hadoop cluster. Link to English video: coming soon. Link to PPT: https://www.slideshare.net/SandeepPatil194/introduction-to-big-data-and-hadoop-81450933 FB page: https://www.facebook.com/bitwsandeep/
AR and Big Data: Interoperable Data Repositories for Collaborative Work Environments (CWEs)
 
19:23
SlideShare: http://www.slideshare.net/AugmentedWorldExpo/ar-and-big-data-interoperable-data-repositories-for-collaborative-work-environments-cwes Jim Novack (Talent Swarm) Anand Gupta (Bigdatastrategy) Collaborative Work Environments (CWE) combined with Telepresence and Mixed Reality technologies offer new ways to improve the outcomes of engineering and building large construction, petrochemical, industrial, aeronautical and defense industry projects. This presentation will describe how in the near future, design, implementation and control processes in these projects will be performed more safely and accurately at lower cost by proposing a framework of existing, open and already adopted standards. Augmented World Expo (AWE) is back for its seventh year in our largest conference and expo featuring technologies giving us superpowers: augmented reality (AR), virtual reality (VR) and wearable tech. Join over 4,000 attendees from all over the world including a mix of CEOs, CTOs, designers, developers, creative agencies, futurists, analysts, investors, and top press in a fantastic opportunity to learn, inspire, partner, and experience first hand the most exciting industry of our times. See more at http://AugmentedWorldExpo.com
Big Data and predictive analysis: use case in the hotel industry
 
02:31
To improve its offer in a business strongly challenged by new players offering new hosting models, a hotel company intends to implement a Big Data solution that can predict hotel occupancy so that rates can be optimized according to demand. Discover how this hotel company has implemented a predictive analysis tool with no previous experience in Big Data thanks to Public Cloud and Orange Business Services experts. More about Orange Business Services: Official website: http://www.orange-business.com/en Facebook: https://www.facebook.com/orangebusiness/ Twitter: https://twitter.com/orangebusiness Linkedin: https://www.linkedin.com/company/oran... Slideshare: http://www.slideshare.net/orangebusiness Pinterest: https://fr.pinterest.com/orangebusiness
"Blockchain & Big Data", Trent McConaghy, Founder & CTO at ascribe GmbH
 
21:04
"Blockchain & Big Data", Trent McConaghy, Founder & CTO at ascribe GmbH Watch more from Data Natives 2015 here: http://bit.ly/1OVkK2J Visit the conference website to learn more: www.datanatives.io Follow Data Natives: https://www.facebook.com/DataNatives https://twitter.com/DataNativesConf Stay Connected to Data Natives by Email: Subscribe to our newsletter to get the news first about Data Natives 2016: http://bit.ly/1WMJAqS Presentation Slides: https://www.slideshare.net/secret/8gK4MtBHS8AJyU About the author: Trent McConaghy is co-founder & CTO of ascribe, which uses blockchain technology and internet-scale machine learning to secure digital creations. Before that, he co-founded Solido Design Automation, which uses large-scale machine learning to help drive Moore's Law. Solido is now widely used in developing next-gen computer chips. Before that, he co-founded ADA, which used machine learning for analog synthesis. ADA was acquired in 2004. Trent has written two critically-acclaimed books on machine learning, creativity and circuit design, in addition to 50 papers+patents. He has given keynotes & invited talks at MIT, Columbia, Berkeley, JPL, Samsung, Qualcomm, Nvidia, Data Science Day, PyData, and more.
Views: 2048 Data Natives
Practical Application of Data Mining Technologies / Alexander Grinchuk / IBMT BSU
 
48:30
Practical Application of Data Mining Technologies / Alexander Grinchuk / IBMT BSU Presentation: http://www.slideshare.net/WG_Talks/data-mining-40811073 Alexander spoke about the business analytics market in Belarus and used real business problems to show the difficulties specialists face when implementing data mining. DataTalks is a series of informal meetups for business analysts and data analysis specialists. Join our group on LinkedIn: https://www.linkedin.com/groups?gid=6788018
Views: 2310 Wargaming CIS
A Look Under Progressive's Big Data Hood - Pawan Divakarla & Brian Durkin
 
32:08
Pawan Divakarla, Data and Analytics Business Leader at Progressive Casualty Insurance Company Brian Durkin, Innovation Enablement Services at Progressive H2O World 2015, Day 3 Contribute to H2O open source machine learning software https://github.com/h2oai Check out more slides on open source machine learning software at: http://www.slideshare.net/0xdata
Views: 1590 H2O.ai
Relationship Between Big Data and Artificial Intelligence
 
01:45
What is Big Data? We covered big data in our previous discussion, but it is worth repeating. Big data means big deals: a very large volume of data, which may be structured or unstructured. For now, just keep these words in mind; we will return to them briefly later in the discussion. What is AI? Artificial Intelligence is the science of making computers do things that require intelligence when done by humans. All AI designs are at least somewhat inspired by the human brain. The objective of AI is to build intelligent agents. For example, consider video game characters. When a video game character goes from point A to point B, it uses an AI algorithm called pathfinding. Relationship between AI and Big Data: Human intelligence builds on what we read, observe, learn, sense and experience. It is our ability to store large amounts of data accumulated over years, and to correlate a few data points to answer a certain question, that makes us intelligent. Similarly, for machines to replicate human intelligence, they will need to absorb large amounts of data to make intelligent decisions. To download it as a PowerPoint presentation, please visit https://www.slideshare.net/marufrion/relationship-between-big-data-ai
Nachum Shacham of Paypal - R and ROI for Big Data
 
33:28
Nachum gives an overview of how Paypal measures the ROI of analytics and how H2O plays a critical role within their overall big data strategy. Don’t just consume, contribute your code and join the movement: https://github.com/h2oai User conference slides on open source machine learning software from H2O.ai at: http://www.slideshare.net/0xdata
Views: 528 H2O.ai
Stream Processing as Game Changer for Big Data and IoT by Kai Wähner
 
39:43
https://www.bigdataspain.org Abstract: https://www.bigdataspain.org/program/thu-stream-processing-game-changer-big-data-internet-things.html Slides: https://www.slideshare.net/secret/pZmmNP72lLoXgC Session presented at Big Data Spain 2016 Conference 17th Nov 2016 Kinépolis Madrid Event promoted by: http://www.paradigmadigital.com
Views: 313 Big Data Spain
A Tool For Big Data Analysis using Apache Spark
 
16:54
A Tool For Big Data Analysis using Apache Spark. Presented at Bangalore Apache Spark Meetup by Ganesha Yadiyala on 10/01/2016. http://www.meetup.com/Bangalore-Apache-Spark-Meetup/events/227472649/ For slides of this talk, refer http://www.slideshare.net/datamantra/a-tool-for-big-data-analysis-using-apache-spark Connect with Ganesha Yadiyala at http://www.datamantra.io https://www.linkedin.com/in/ganeshayadiyala https://twitter.com/ganeshayadiyala
Views: 499 datamantra
Big Data and Cyber Security, David Stubley (7Elements)
 
28:09
The slides are here: http://www.slideshare.net/WilliamBuchanan1/big-data-and-cyber-security
Views: 298 The Cyber Academy
Tom Kraljevic - Big Data Environments
 
37:13
Tom Kraljevic discusses big data environments with H2O on Hadoop, AWS, Apache Spark, and more. Don’t just consume, contribute your code and join the movement: https://github.com/h2oai User conference slides on open source machine learning software from H2O.ai at: http://www.slideshare.net/0xdata
Views: 814 H2O.ai
Mike Pittaro - High Performance Hardware for Data Analysis
 
36:10
View slides for presentation here: http://www.slideshare.net/PyData/mike-pittaro-high-performance-hardware-for-data-analysis PyData NYC 2014 Choosing hardware for big data analysis is difficult because of the many options and variables involved. The problem is more complicated when you need a full cluster for big data analytics. This session will cover the basic guidelines and architectural choices involved in choosing analytics hardware for Spark and Hadoop. I will cover processor core and memory ratios, disk subsystems, and network architecture. This is a practical advice oriented session, and will focus on performance and cost tradeoffs for many different options.
Views: 350 PyData
Ecommerce Analytics - Click Stream Data Analytics
 
02:11
Convert your abandoned shopping cart into sales. Identify segments and target your customers. Increase Conversion using click stream data analytics using predictive modelling techniques. Learn more about Happiest Minds Ecommerce Solution http://www.happiestminds.com/ecommerce-solutions/ Learn more about Happiest Minds Ecommerce Analytics http://www.happiestminds.com/ecommerce-analytics/ Related Links http://www.happiestminds.com/big-data-analytics/ Website http://www.happiestminds.com/ Have a question? Write to us http://www.happiestminds.com/write%20to%20us Connect with us on Facebook: https://www.facebook.com/happiestminds Twitter : http://twitter.com/#!/happiestminds LinkedIn : http://www.linkedin.com/company/happiest-minds-technologies Slideshare : http://www.slideshare.net/happiestminds Google + : https://plus.google.com/u/0/+happiestminds/posts
Views: 7965 Happiest Minds
The Big 6 Steps Of Big Data Explained [Audio]
 
08:21
The world population as of October 2017 was 7.6 billion people, which directly points to the fact that this is Big Data! All the insights you receive from running digital campaigns are your big data! What is Big Data? It is voluminous information or relevant statistics acquired by companies, firms and large organizations. Often this big data is difficult to compute manually. Read our blog to gain more insights on Big Data https://goo.gl/FZpXpB The Big 6 Steps 1. Data Mining 2. Data Collection 3. Data Storing 4. Data Cleaning 5. Data Analysis 6. Data Consumption With all your marketing efforts for your business in place, it's a good idea to have an all-in-one payment solution in place as well. Sign up on PayUmoney now to enjoy the best payment gateway experience and grow your business effortlessly. Know all the features and benefits of PayUmoney by watching this video https://goo.gl/i2wjT4 For all the latest updates, reach out to us at: Blog - https://goo.gl/57f9ea Facebook - https://www.facebook.com/PayUmoney/ Twitter - https://twitter.com/PayUmoney Slideshare - https://www.slideshare.net/PayUmoney_India
Views: 20 PayUmoney
Educate 2017: Mining for Gold: Using advanced analytics to get more value from your data
 
29:43
Slide deck available at: https://www.slideshare.net/learnosity/educate-2017-mining-for-gold-using-advanced-analytics-to-get-more-value-from-your-data Data mining is a multi-industry trend that looks certain to grow in both scope and application. It offers a hugely powerful means of identifying aggregates and trends at all levels, which is why we’ve worked extensively on deepening our analytics APIs over the last year. This session with Michael Sharman and Denis Hoctor will walk you through the latest advancements to our analytics and show you how we can make your data more valuable and how we handle the heavy lifting to make it easier for you to create complex reports at individual, class, school, or district level.
Views: 92 Learnosity
Building Scalable, Flexible Data Pipelines for Big Data, Vivek Ganesan 20140224
 
01:32:20
Speaker: Vivek Ganesan Data Science meeting (Formerly Data Mining) This presentation will be an overview of ETL (Extract, Transform, and Load) tasks and tools in Hadoop and will cover the pros/cons of different approaches. Speaker Bio Vivek has worked on big data and cloud deployments at large companies such as Intuit and Paypal, and also in startups. Currently he provides expert consulting services to Fortune 500 clients on Big Data projects. Slides: http://www.slideshare.net/vivekganesan/big-data-pipelines http://www.meetup.com/SF-Bay-ACM/events/160985942/
Views: 3411 San Francisco Bay ACM
Turning big data into big insight: New algorithms for big data analytics
 
23:12
Data scientists and professional analysts spend much of their time focusing on how to use the massive amounts of data at their fingertips to enhance decision making. The latest release of IBM SPSS Modeler includes algorithms specifically designed to handle more types and sources of data than ever before, helping users uncover insights quickly while using them to differentiate and grow their business. Learn how to do the following: • Identify the big data algorithms included in SPSS Modeler. • Build predictive models using SPSS Modeler’s big data algorithms. • Deploy predictive models to decision makers. • Explore predictive extensions as part of the SPSS predictive analytics community. Discover how SPSS Modeler can help your organization solve its big data challenges. Learn more about IBM SPSS: http://ibm.co/spsstrial Subscribe to the IBM Analytics Channel: https://www.youtube.com/subscription_center?add_user=ibmbigdata The world is becoming smarter every day, join the conversation on the IBM Big Data & Analytics Hub: http://www.ibmbigdatahub.com https://www.youtube.com/user/ibmbigdata https://www.facebook.com/IBManalytics https://www.twitter.com/IBMbigdata https://www.linkedin.com/company/ibm-big-data-&-analytics https://www.slideshare.net/IBMBDA
Views: 4422 IBM Analytics
Knowledge Graphs as a Data Platform - Data Architecture Summit 2017
 
53:01
Slides available here: https://www.slideshare.net/BenjaminNussbaum/knowledge-graphs-as-a-data-platform Big data has given rise to massive volumes of highly interconnected and increasingly complex information, coming from many sources. This introduces a host of implementation challenges that require knowledge in building intelligent systems. While many novel solutions exist to model and manage complex data – across the NoSQL and especially the Graph Database space – there are crucial limitations to these solutions. We discuss how to get the most out of complex, multi-sourced, heterogeneous data by showing how to model it expressively, migrate it efficiently, and query it intuitively; using knowledge graphs as a data platform for knowledge management. Learn how knowledge graphs can eliminate many of the challenges of working with complex data: traversing complex relationships, drawing crucial insight, and effectively analyzing data to fully harness its value. Help in managing your complex data through a reference architecture and connected data platform is available at www.graphgrid.com
Views: 2418 GraphGrid
Tomer Shiran: Self Service Data Exploration with Apache Drill
 
49:27
Video from http://www.meetup.com/Data-Mining/events/196951762/, including both Tomer's talk and Bruce's live Apache Drill demo. Tomer's slides are located at http://www.slideshare.net/MapRTechnologies/self-service-data-exploration-with-apache-drill .
Views: 878 SF Data Mining
Data Science - Part III -  EDA & Model Selection
 
01:48:37
For downloadable versions of these lectures, please go to the following link: http://www.slideshare.net/DerekKane/presentations https://github.com/DerekKane/YouTube-Tutorials This lecture introduces the concept of EDA, understanding, and working with data for machine learning and predictive analysis. The lecture is designed for anyone who wants to understand how to work with data and does not get into the mathematics. We will discuss how to utilize summary statistics, diagnostic plots, data transformations, variable selection techniques including principal component analysis, and finally get into the concept of model selection.
Views: 34505 Derek Kane
Data Mining in Practice: Pitfalls of Data Analysis / Ksenia Petrova / COO dmlabs.org
 
33:50
Drawing on her own experience, Ksenia described the main pitfalls a young analyst can run into. Data Mining in Practice: Pitfalls of Data Analysis / Ksenia Petrova / COO dmlabs.org Presentation: http://www.slideshare.net/WG_Talks/data-mining-dmlabsorg DataTalks is a series of informal meetups for business analysts and data analysis specialists. Join our group on LinkedIn: https://www.linkedin.com/groups?gid=6788018
Views: 7272 Wargaming CIS
Big Data in Cyber Security, Simon Arnell (HPE)
 
22:18
Slides: http://www.slideshare.net/WilliamBuchanan1/big-data-in-cyber-security
Views: 202 The Cyber Academy
KNIME Italy Meetup - Going Big Data on Apache Spark
 
37:22
The talk I gave at the KNIME Meetup in Milan ("KNIME Italy Meetup goes Big Data on Apache Spark"). You can find the slides here: http://www.slideshare.net/AndreBessi/knime-italy-meetup-going-big-data-on-apache-spark Apache Spark is an engine for large-scale data processing. It lets you build and test predictive models quickly, and includes modules for SQL, Streaming, Machine Learning, and Graph Processing. KNIME (Konstanz Information Miner) is an open source data analysis and reporting platform that integrates various components for machine learning and data mining. Its graphical interface supports data preprocessing, modeling, data analysis, and visualization. KNIME Spark Executor is a set of nodes used to create and run applications on Apache Spark from the familiar KNIME Analytics platform. In this talk, we take a closer look at the architecture of the KNIME Spark Executor, see how KNIME interacts with Spark, and look at the new nodes developed by Databiz.
Views: 321 Andrea Bessi
#HOTR TARGET, ADAPT, SELL, FORECAST, IT'S TIME TO MAKE BIG DATA TALK - HENRI VERDIER
 
19:45
Slides: http://www.slideshare.net/secret/A6VmvuUMYs90LL HENRI VERDIER - Chief Data Officer of the French Government & Director @ Etalab Henri Verdier is a French entrepreneur, and currently the Head of Etalab, the French Agency for Public Open data. With Etalab, he launched the first government open data portal open to Citizen's contributions. Henri Verdier was CEO of MFG Labs, an internet startup involved in social data mining, and Chairman of the Board of Cap Digital, the French European Cluster for Digital Content and Services located in Paris Region. Entrepreneurship is not innate, and 80% of mistakes can be avoided. Don't waste time, treat yourself to Koudetat: http://bit.ly/koudetat-youtube
Views: 326 Startupfood
Data Mining, Lecture 1
 
01:21:00
Technosphere Mail.ru Group, Lomonosov Moscow State University. Course "Algorithms for Intelligent Processing of Large Volumes of Data", Lecture 1: "Data Mining Tasks". Lecturer: Nikolay Anokhin. Overview of data mining tasks. Standardizing the approach to solving data mining problems. The CRISP-DM process. Types of data. Clustering, classification, regression. The concepts of a model and a learning algorithm. Lecture slides: http://www.slideshare.net/Technosphere1/lecture-1-47107550 Other lectures of the Data Mining course | https://www.youtube.com/playlist?list=PLrCZzMib1e9pyyrqknouMZbIPf4l3CwUP Official Technopark website | https://tech-mail.ru/ Official Technosphere website | https://sfera-mail.ru/ Technopark on VKontakte | http://vk.com/tpmailru Technosphere on VKontakte | https://vk.com/tsmailru Blog on Habr | http://habrahabr.ru/company/mailru/ #ТЕХНОПАРК #ТЕХНОСФЕРА
Java as a fundamental working tool of the Data Scientist (Alexey Zinoviev, Joker, 2014)
 
55:10
Alexey Zinoviev presented this paper at the Joker conference http://jokerconf.com/#zinoviev. Slides: http://www.slideshare.net/zaleslaw/java-for-data-scientist-zinoviev This paper covers the following topics: Data Mining, Machine Learning, Mahout, Spark, MLlib, Python, Octave, R language
Views: 2340 Alexey Zinoviev
Data Science - Part I - Building Predictive Analytics Capabilities
 
01:52:19
For downloadable versions of these lectures, please go to the following link: http://www.slideshare.net/DerekKane/presentations https://github.com/DerekKane/YouTube-Tutorials This is the first video lecture in a series on data analytics topics and is geared toward individuals and business professionals who have no understanding of building modern analytics approaches. This lecture provides an overview of the models and techniques we will address throughout the lecture series; we will discuss Business Intelligence topics, predictive analytics, and big data technologies. Finally, we will walk through a simple yet effective example which showcases the potential of predictive analytics in a business context.
Views: 158232 Derek Kane
Jon Sedar - Text mining to correct missing CRM information
 
12:30
http://www.slideshare.net/PyData/text-mining-to-correct-missing-crm-information-jonathan-sedar
Views: 364 PyData
Data Mining with WEKA and KNIME
 
01:46
This video presents data mining GUI tools and concepts. WEKA is a collection of state-of-the-art machine learning algorithms and data preprocessing tools written in Java, developed at the University of Waikato, New Zealand. KNIME stands for Konstanz Information Miner. It is an open source data analytics, reporting and integration platform. You can view the presentation at https://www.slideshare.net/pgpm64/data-mining-gui-tools-with-demo
Java in production for Data Mining Research projects (JavaDayKiev'15)
 
51:01
Alexey Zinoviev presented this paper at the JavaDayKiev'15 conference. Slides: http://www.slideshare.net/zaleslaw/javadaykiev15-java-in-production-for-data-mining-research-projects This paper covers the following topics: Data Mining, Machine Learning, Hadoop, Spark, MLlib
Views: 309 Alexey Zinoviev
Hadoop. Introduction to Big Data and MapReduce
 
02:01:20
Technosphere Mail.ru Group, Lomonosov Moscow State University. Course "Methods of Distributed Processing of Large Volumes of Data in Hadoop", Lecture 1: "Introduction to Big Data and MapReduce". Lecturer: Alexey Romanenko. What "big data" is. The history of this phenomenon. Knowledge and skills needed to work with big data. What Hadoop is and where it is used. What "cloud computing" is, and the history and development of the technology. Web 2.0. Utility computing. Virtualization. Infrastructure as a Service (IaaS). Parallelism issues. Managing many workers. Data centers and scalability. Typical big data tasks. MapReduce: what it is, with examples. Distributed file systems. Google File System. HDFS as a clone of GFS, and its architecture. Lecture slides: http://www.slideshare.net/Technopark/lecture-01-48215730 Other lectures of the course | https://www.youtube.com/playlist?list=PLrCZzMib1e9rPxMIgPri9YnOpvyDAL9HD Our video channel | http://www.youtube.com/user/TPMGTU?sub_confirmation=1 Official Technopark website | https://tech-mail.ru/ Official Technosphere website | https://sfera-mail.ru/ Technopark on VKontakte | http://vk.com/tpmailru Technosphere on VKontakte | https://vk.com/tsmailru Blog on Habr | http://habrahabr.ru/company/mailru/ #ТЕХНОПАРК #ТЕХНОСФЕРА
Streaming Analytics presented by Big Data Developers at Streams Developer Day
 
45:20
Big Data Developers presents streaming analytics, a hands-on tutorial broadcast live during Streams Developer Day. To find your local Big Data Developers meetup group and learn more about the applications of big data visit www.ibm.meetup.com. With smart phones and fully instrumented cars, the amount of data we can collect from moving objects is growing at staggering rates. In a few years, the automotive industry will be the largest producer of data after utilities, bigger than health care. And with this Big Data volume comes Big Data challenges. The opportunity for applying all this data in real-time to problems in transportation, congestion management, emergency response, microweather prediction, supply chain management, and so on is tremendous. But this requires a real-time streaming analytics platform that can integrate GPS locations, telematics messages and sensor readings, video, and other kinds of information, and scale up to any level. Beyond automotive applications, the need for a real-time big-data streaming analytics platform is becoming critical in other industries, including Government, Telecommunications, Healthcare, Energy and Utilities, Finance, and Manufacturing. Join the Big Data Developers meetup and hear about future live streams: http://www.meetup.com/bigdatadevelopers Subscribe to the IBM Analytics Channel: https://www.youtube.com/subscription_center?add_user=ibmbigdata The world is becoming smarter every day, join the conversation on the IBM Big Data & Analytics Hub: www.ibmbigdatahub.com www.youtube.com/user/ibmbigdata www.facebook.com/IBManalytics www.twitter.com/IBMAnalytics www.linkedin.com/company/ibm-big-data-&-analytics www.slideshare.net/IBMBDA
Views: 1345 IBM Analytics
Java in production for Data Mining Research projects (JET'15, Minsk)
 
57:22
Alexey Zinoviev presented this paper at the JET conference. Slides: http://www.slideshare.net/zaleslaw/javadaykiev15-java-in-production-for-data-mining-research-projects This paper covers the following topics: Data Mining, Machine Learning, Hadoop, Spark, MLlib
Views: 164 Alexey Zinoviev
Big Data algorithms and data structures for large-scale graphs
 
38:55
Alexey Zinoviev, Tamtek www.dump-it.ru Original presentation: http://www.slideshare.net/it-people/big-data-algorithms-and-data-structures-for-large-scale-graphs-32925091
First Steps in Data Mining Kindergarten (Alexey Zinoviev, Omsk, 2014)
 
45:12
Alexey Zinoviev presented this paper at the Second Thumbtack Technology Expert Day. This paper covers the following topics: Data Mining, Machine Learning, Octave, R language Slides: http://www.slideshare.net/zaleslaw/first-steps-in-data-mining-kindergarten
Views: 163 Alexey Zinoviev
#BDAM: Kite SDK: Helping Hadoop projects work together - 06/23 Big Data Application Meetup, Talk #2
 
37:25
Speaker: Ryan Blue from Cloudera Big Data Applications Meetup, 06/23/2015 Palo Alto, CA More info here: http://www.meetup.com/BigDataApps/ Link to slides: http://www.slideshare.net/_blue/big-data-applications-meetup-cask About this talk: Big data applications on Hadoop commonly require several projects from the ecosystem working together on the same data. Interoperability between those projects remains a big challenge for developers because those projects all interact with datasets differently. When Spark writes files with an OutputFormat, how does Impala know how to read it? In this talk, Ryan Blue from Cloudera will introduce Kite, a data-focused API for Hadoop, and talk about how we are using it to address the interoperability problem.
Views: 598 Cask
JOSA TechTalks - Real Time and Big Data
 
51:02
Although Hadoop might be the first thing to come to mind when you think of processing large data sets, it is not always the best solution for your Big Data problems. Hadoop might be the right choice for batch-processing big data, but when it comes to real-time data processing there are other architectures and tools to consider. This TechTalk shows the need behind solving real-time data problems and explains the Lambda architecture, covering Druid as an example, and the simpler and less expensive "relay model". By Mahmoud Jalajel - Head of Data Science Lab at Blue Kangaroo Presentation on: http://www.slideshare.net/jordanopensource/jalajel-tech-talk
Trade-offs in Data Science / Ivan Myakishev and Alexey Yurkevich / Wargaming
 
27:06
Trade-offs in Data Science / Ivan Myakishev and Alexey Yurkevich / Wargaming Presentation: http://www.slideshare.net/WG_Talks/data-mining-wargaming Every specialist encounters this in their professional life. At some point we realize that the theoretical knowledge gained at university or from specialized literature does not work. Real life is not the examples from a Machine Learning course. The speakers described the main trade-offs that businesses and analysts have to make: trade-offs of speed, accuracy, and simplicity. DataTalks is a series of informal meetups for business analysts and data analysis specialists. Join our group on LinkedIn: https://www.linkedin.com/groups?gid=6788018
Views: 841 Wargaming CIS
Data Preparation vs. Data Wrangling Comparison in Machine Learning / Deep Learning
 
40:50
Data Preparation: Comparison of Programming Languages, Frameworks and Tools for Data Preprocessing and (Inline) Data Wrangling in Machine Learning / Deep Learning Projects. A key task to create appropriate analytic models in machine learning or deep learning is the integration and preparation of data sets from various sources like files, databases, big data storages, sensors or social networks. This step can take up to 80% of the whole project. This session compares different alternative techniques to prepare data, including extract-transform-load (ETL) batch processing (like Talend, Pentaho), streaming analytics ingestion (like Apache Storm, Flink, Apex, TIBCO StreamBase, IBM Streams, Software AG Apama), and data wrangling (DataWrangler, Trifacta) within visual analytics. Various options and their trade-offs are shown in live demos using different advanced analytics technologies and open source frameworks such as R, Python, Apache Hadoop, Spark, KNIME or RapidMiner. The session also discusses how this is related to visual analytics tools (like TIBCO Spotfire), and best practices for how the data scientist and business user should work together to build good analytic models. Key takeaways for the audience: - Learn various options for preparing data sets to build analytic models - Understand the pros and cons and the targeted persona for each option - See different technologies and open source frameworks for data preparation - Understand the relation to visual analytics and streaming analytics, and how these concepts are actually leveraged to build the analytic model after data preparation Slide Deck: http://www.slideshare.net/KaiWaehner/data-preparation-vs-inline-data-wrangling-in-data-science-and-machine-learning
Views: 2174 Kai Wähner
Harvesting Facebook data from IBM SPSS Statistics
 
15:23
Learn how to build an extension in IBM SPSS Statistics that imports data from a public Facebook page. For more information, explore the blog post: http://ibm.co/1OEln0M Subscribe to the IBM Analytics Channel: https://www.youtube.com/subscription_center?add_user=ibmbigdata The world is becoming smarter every day, join the conversation on the IBM Big Data & Analytics Hub: http://www.ibmbigdatahub.com https://www.youtube.com/user/ibmbigdata https://www.facebook.com/IBManalytics https://www.twitter.com/IBMbigdata https://www.linkedin.com/company/ibm-big-data-&-analytics https://www.slideshare.net/IBMBDA
Views: 1956 IBM Analytics
[LIVE] Kamanja: A New Open Source Real-Time System for Scoring Data Mining Models, Greg Makowski,
 
54:23
[Streamed version. Front & back trimmed. Slide issue in beginning.] An edited version is available: https://www.youtube.com/watch?v=ANqB72b0r38 Slides: http://www.slideshare.net/gregmakowski/kamanja-driving-business-value-through-realtime-decisioning-solutions Greg Makowski, Director of Data Science, LigaDATA This talk will start with a number of complex real-time data use cases, such as a) complex event processing, b) supporting the modeling of a data mining department and c) developing enterprise applications on Apache big-data systems. While Hadoop and big data have been around for a while, banks and healthcare companies tend not to be early IT adopters. What are some of the security issues or roadblocks in Apache big data systems for industries with such high requirements? Data mining models can be trained in dozens of packages, but what can simplify the deployment of models regardless of where they were trained or with what algorithm? Predictive Model Markup Language (PMML) is a type of XML with specific support for 15 families of data mining algorithms. Data mining software such as R, KNIME, Knowledge Studio and SAS Enterprise Miner are PMML producers. The new open-source product, Kamanja, is the first open-source, real-time PMML consumer (scoring system). One advantage of PMML systems is that they can reduce the time to deploy production models from 1-2 months to 1-2 days, a pain point that may be less obvious if your data mining exposure is competitions or MOOCs. Kamanja is free on GitHub and supports Kafka, MQ, Spark, HBase and Cassandra, among other things. Being a new open-source product, Kamanja initially supports rules, trees and regression. I will cover the architecture of a sample application using multiple sources of real-time open source data, such as social network campaigns and tracking sentiment for the bank client and its competitors. Other real-time architectures cover credit card fraud detection.
A brief demo will be given of the social network analysis application, with text mining. An overview of products in the space will include popular Apache big data systems, real-time systems and PMML systems. For more details: http://kamanja.org/ http://www.meetup.com/SF-Bay-ACM/events/223615901/ http://www.sfbayacm.org/event/kamanja-new-open-source-real-time-system-scoring-data-mining-models Venue sponsored by eBay, Food and live streaming sponsored by LigaDATA, San Jose, CA, July 27, 2015 Chapter Chair Bill Bruns Data Science SIG Program Chair Greg Makowski Vice Chair Ashish Antal Volunteer Coordinator Liana Ye Volunteers Joan Hoenow, Stephen McInerney, Derek Hao, Vinay Muttineni Camera Tom Moran Production Alex Sokolsky Copyright © 2015 ACM San Francisco Bay Area Professional Chapter
Views: 1151 San Francisco Bay ACM
Data Mining, Lecture 2
 
01:54:58
Technosphere Mail.ru Group, Lomonosov Moscow State University. Course "Algorithms for Intelligent Processing of Large Volumes of Data", Lecture 2: "The Clustering Problem and the EM Algorithm". Lecturer: Nikolay Anokhin. Statement of the clustering problem. Distance functions. Clustering quality criteria. The EM algorithm. K-means and its modifications. Lecture slides: http://www.slideshare.net/Technosphere1/lecture-2-47107553 Other lectures in the Data Mining course | https://www.youtube.com/playlist?list=PLrCZzMib1e9pyyrqknouMZbIPf4l3CwUP Our video channel | http://www.youtube.com/user/TPMGTU?sub_confirmation=1 Official Technopark website | https://tech-mail.ru/ Official Technosphere website | https://sfera-mail.ru/ Technopark on VKontakte | http://vk.com/tpmailru Technosphere on VKontakte | https://vk.com/tsmailru Blog on Habr | http://habrahabr.ru/company/mailru/ #ТЕХНОПАРК #ТЕХНОСФЕРА
Big Data Presentation SER322 (High Res)
 
12:59
Proper High resolution version. References Marr, B. (2014a). Big data: The 5 vs everyone must know. Linkedin. https://www.linkedin.com/pulse/20140306073407-64875646-big-data-the-5-vs-everyone-must-know Marr, B. (2014b). What is big data? Linkedin. http://www.slideshare.net/BernardMarr/140228-big-data-slide-share Learning Tree International. (2014). What is big data and Hadoop? YouTube. https://www.youtube.com/watch?v=FHVuRxJpiwI ExplainingComputers, (2012). Explaining big data. YouTube. https://www.youtube.com/watch?v=7D1CQ_LOizA Global internet traffic to surpass one zettabyte in 2016. (2016). University of Florida. https://news.it.ufl.edu/general-news/global-internet-traffic-to-surpass-one-zettabyte-in-2016/ Rouse, M. (2000-2016). Definition exabyte (EB). TechTarget. http://searchstorage.techtarget.com/definition/exabyte Watson (computer). (2016). Wikimedia Foundation, Inc. https://en.wikipedia.org/wiki/Watson_(computer) Engadget. (2011). IBM's Watson supercomputer destroys humans in Jeopardy | Engadget. YouTube. https://www.youtube.com/watch?v=WFR3lOm_xhE knowlengr. (2013-2015). Another v: Making the case for big data veracity. Krypton Brothers LLC. http://kryptonbrothers.com/news/big-data-veracity/ Hill, K. (2012). How Target figured out a teen girl was pregnant before her father did. Forbes.com LLC. http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/2/#64ccb45171cc Smith, C. (2016). By the number; 200+ amazing Facebook statistics (January 2016). DMR. http://expandedramblings.com/index.php/by-the-numbers-17-amazing-facebook-stats/ Jesse Anderson. (2013). Learn MapReduce with playing cards. YouTube. https://www.youtube.com/watch?v=bcjSe0xCHbE
Views: 31 Tyler Cole
From data to AI with the Machine Learning Canvas by Louis Dorard
 
44:33
https://www.bigdataspain.org Abstract: https://www.bigdataspain.org/program/fri-from-data-to-ai-with-the-machine-learning-canvas.html Slides: https://www.slideshare.net/secret/ETf7l0mccVWV8y Session presented at Big Data Spain 2016 Conference 17th Nov 2016 Kinépolis Madrid Event promoted by: http://www.paradigmadigital.com
Views: 477 Big Data Spain
No coding required: Expand your data mining toolset with predictive extensions
 
47:22
Open-source technology is creating new opportunities to become data-driven. For years, users of IBM SPSS Modeler have incorporated open-source languages within predictive models to conduct analysis. Now, take this capability a step farther with the introduction of predictive extensions, which can be plugged into SPSS Modeler to allow professional analysts and business users to take advantage of additional functionality without having to write code. Find out how enhancements to SPSS Modeler will enable you to do the following: • Find predictive extensions with the help of the SPSS predictive analytics community. • Download, install and modify predictive extensions through the SPSS Modeler interface. • Build your own R- and Python-based predictive extensions using the Custom Dialog Builder. Learn how you can use SPSS predictive extensions to take advantage of open source technology in your data analysis. Learn more about IBM SPSS http://ibm.co/spsstrial Subscribe to the IBM Analytics Channel: https://www.youtube.com/subscription_center?add_user=ibmbigdata The world is becoming smarter every day, join the conversation on the IBM Big Data & Analytics Hub: http://www.ibmbigdatahub.com https://www.youtube.com/user/ibmbigdata https://www.facebook.com/IBManalytics https://www.twitter.com/IBMbigdata https://www.slideshare.net/IBMBDA
Views: 294 IBM Analytics
Data Scientist: Mastering Big Data with Free Software
 
51:47
The talk presents general Big Data concepts, the characteristics and activities of the Big Data professional (the Data Scientist), how to become a data scientist, the main tools on the market, and how this professional can use the potential of free and open source software tools to master this field. Participants will get an overview of Hadoop, Cassandra, MongoDB, NoSQL, BI, Data Mining and Analytics, among other emerging concepts in data governance. Slides at http://pt.slideshare.net/ambientelivre/cientista-de-dados-dominando-o-big-data-com-software-livre
Views: 8338 Ambiente Livre