
Problem Solving using PySpark - Regression & Classification

voska89


Published 12/2023
Created by Sathish Jayaraman
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz, 2 Ch
Genre: eLearning | Language: English | Duration: 35 Lectures (1h 48m) | Size: 1.1 GB

Gradient Boosted Trees, XGBoost, Spark NLP, Prophet, Data Cleaning, Descriptive Statistics, Spark SQL
What you'll learn
Data analysis and descriptive statistics with PySpark - Learning to compute essential descriptive statistics for data understanding and summarization
Data Cleaning with PySpark
Predictive modeling with PySpark using Regression
Applying Classification techniques to a real world problem in PySpark
Text analytics using PySpark and Spark NLP
Time-Series modeling with PySpark and Prophet
Introduction to Spark SQL for data querying
Requirements
Basic knowledge of data science and ML principles will be helpful
Familiarity with Python to work with PySpark
A computer with internet to access course material
Description
This course is based on real-world problems in PySpark covering data cleaning, descriptive statistics, classification and regression modeling.
The first segment introduces descriptive statistics in PySpark: computing fundamental measures such as the mean and standard deviation, and generating an extended statistical summary. The second segment covers cleaning data in PySpark: working with null values, removing redundant data and imputing the null values. The third segment is about predictive modeling with PySpark using Gradient Boosted Trees Regression.
The fourth and fifth segments apply classification techniques in PySpark. The fourth segment introduces the Spark XGB Classifier for a classification problem, and the fifth segment uses a deep learning model for text sentiment classification. The sixth segment is about time series analytics and modeling using PySpark and Prophet. The seventh segment introduces Spark SQL for data querying and analysis.
These segments also include advanced visualization techniques using the Seaborn and Plotly libraries: box plots to understand the distribution of the data and assess outliers, count plots to check the balance in the proportion of data, a bar chart to represent feature importance in the Gradient Boosted Trees Regression model, a word cloud for text analytics, and time series decomposition to extract seasonality and trend components. Each segment includes a Google Colab notebook aligned with the lecture.
Who this course is for
This course suits anyone interested in analytics with PySpark. It is particularly useful for analysts and engineers interested in Big Data who have a basic knowledge of data science and ML principles.
Homepage
https://www.udemy.com/course/problem-solving-using-pyspark-regression-classification/



