Python Pandas Tutorial 15 | How to Identify and Drop Null Values | Handling Missing Values in Python

Hi there. In this video I will talk about how you can identify and drop null values. Null values are part of every real-world data set, and you need to know what to do with them. One option is to drop them, and that is what we cover in this video: first how to identify and pinpoint where the null values are, and then how to drop them.

First of all, let's import our library, which is pandas: import pandas as pd. Then let's get the data set we are working with, the Superstore sales data set, and see whether it has any null values. Let's go ahead and execute this. It takes a couple of seconds... okay, it's done.

Once you have created the orders object for the sheet that contains the orders information, you can use the isnull method. I type .is and press Tab, and you can see it offers is_copy, isin and isnull. The method we are interested in is isnull, so I press Tab again, or Enter. When I execute this command, it shows the entire data frame filled with True and False values. True means that the value in that particular cell, the intersection of a row and a column, is null; False means the value is not null. There are some 8,300-odd rows in this data set, though, so the full output takes a couple of seconds and is large on the screen. So I'll just use head.
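As a rough sketch of the isnull and head calls described so far: the small data frame below is a hypothetical stand-in for the orders sheet (the real Superstore file is not loaded here, and the values are made up), with column names borrowed from the ones mentioned in the video.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the orders sheet; values are invented for illustration.
orders = pd.DataFrame({
    "Customer Name": ["Ann", "Bob", "Cara", "Dev"],
    "Province": ["Ontario", "Quebec", "Alberta", "Ontario"],
    "Product Base Margin": [0.56, np.nan, 0.38, np.nan],
})

# isnull returns a data frame of the same shape:
# True marks a null cell, False a populated one.
null_mask = orders.isnull()

# Preview only the first rows instead of printing the whole mask.
print(null_mask.head(2))
```

On a large data frame, head (or tail) keeps the preview short, but it only shows the rows at either end, so nulls in the middle stay hidden.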

With head, you can now see the output of the isnull function for the first few rows. As you can see, pretty much everything is False, and that is correct, because nothing is null in these rows or columns. It becomes really interesting when there are no nulls in the first few rows or the last few rows (we can use head as well as tail): where are the null values then?

The interesting thing is to use the sum function. It lists the column names in one column and, in the next column, the sum, or really I should say the count, of nulls. It does this by counting the True values, because True is represented by 1 behind the scenes. If we execute this and I scroll down, you can see that we have null values only in Product Base Margin.

Now let's identify the rows where Product Base Margin is null. For that you need to filter on the Product Base Margin column within the orders object, so you write orders and then Product Base Margin; I'll just copy the name and paste it over here. Then write isnull and press Enter. That shows me an error, so let me quickly check and scroll down to see it. Okay, I think I see what we have done; I'll just put it over here. Basically, what we are saying is: show only those rows where the column is null. Earlier we were applying isnull to the whole data frame.
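The two steps just described, counting nulls per column and then filtering on a single column, might look like this. The data frame is again a hypothetical stand-in with made-up values; only the column names come from the video.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the orders sheet.
orders = pd.DataFrame({
    "Customer Name": ["Ann", "Bob", "Cara", "Dev"],
    "Province": ["Ontario", "Quebec", "Alberta", "Ontario"],
    "Product Base Margin": [0.56, np.nan, 0.38, np.nan],
})

# Count nulls per column: each True in the mask counts as 1.
null_counts = orders.isnull().sum()
print(null_counts)

# Build the mask from the one column, then use it to select rows.
missing_margin = orders[orders["Product Base Margin"].isnull()]
print(missing_margin)
```

The mask passed inside the brackets is a single column of True and False values, one per row, which is what row filtering expects.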

With isnull applied to the column name instead of the data frame, let's try to execute this. Yes, now it works. If I go towards the right, here are all the null values, and only those rows that have the null value. You can see there are 63 rows, and for Product Base Margin the sum function also gave us 63 earlier.

Similar to isnull, we have notnull as well, which is the opposite of isnull. For example, if I say orders.notnull().head() and execute that, it shows True for all the values that are not null and False for those that are null. So it depends on what kind of operation you want to carry out or what kind of output you want: isnull gives these values as False and notnull gives them as True, and you can then count them with the help of the sum function to get the desired output.

Now let's see how you can drop null values. Before dropping, let's look at the shape to get the number of rows and columns: we have 8,399 rows and 21 columns. To drop, use orders.dropna, and you need to specify arguments. If I show you the argument list, the main argument is how, which is equal to any. Then axis equal to 0 indicates rows; if you change the 0 to 1, it operates on columns. Then there are thresh, subset and inplace. By default inplace is False, so whatever we do here will not affect our original data frame; if you want it to take effect on the original, make it True. First of all, let's experiment with how equal to any.
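A minimal sketch of notnull, shape and dropna with how equal to any, run on a hypothetical four-row stand-in rather than the real 8,399-row sheet:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the orders sheet.
orders = pd.DataFrame({
    "Customer Name": ["Ann", "Bob", "Cara", "Dev"],
    "Province": ["Ontario", "Quebec", "Alberta", "Ontario"],
    "Product Base Margin": [0.56, np.nan, 0.38, np.nan],
})

# notnull is the inverse of isnull: True where a value is present.
not_null_mask = orders.notnull()
print(not_null_mask.head())

# Shape before dropping: (rows, columns).
print(orders.shape)  # (4, 3)

# how="any" drops every row that contains at least one null.
# axis=0 (the default) works on rows; axis=1 would drop columns instead.
dropped_any = orders.dropna(how="any")
print(dropped_any.shape)  # (2, 3)

# inplace defaults to False, so the original data frame is unchanged.
print(orders.shape)  # (4, 3)
```

Assigning the result to a new name, as here, is one way to keep the original available for comparison.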

With how equal to any, look at the shape: those 63 rows are removed, and the count is now 8,336 rows by 21 columns. Apart from any, the other value you can pass is all, as you can see over here: how equal to any or all. With all, a row is removed only if all the values in that row are NA. So copy, paste. If you don't know how I brought up that parameter section, I just pressed Shift and then Tab two times, and there we get the information. If I execute this, you get 8,399 by 21, the same as our original. That means no row has null values across the entire row, so nothing has been dropped.

After this you can explore another parameter, subset, which is sometimes very helpful. What is happening here is that dropna looks at all the columns and checks whether, in a particular row, the values in all of them are null. If you want, you can restrict that to a subset: write orders.dropna with subset equal to the column names within brackets, then how equal to all, and then shape. It is sometimes a little difficult to scroll up and down to see the column names, so what I usually do is comment this out and use the columns attribute, which gives me the column names. Now I can choose whatever columns I want. Let's say I want Customer Name and Province; these are the two columns, and they are already in the required format, so I don't have to do a lot of typing. I just paste them over there.
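Here is a small sketch of how equal to all, with and without subset. Unlike the data in the video, this hypothetical stand-in deliberately includes one row where both Customer Name and Province are missing, so that the subset call has something to drop.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in; the last row is missing both name and province.
orders = pd.DataFrame({
    "Customer Name": ["Ann", "Bob", "Cara", "Dev", None],
    "Province": ["Ontario", "Quebec", "Alberta", "Ontario", None],
    "Product Base Margin": [0.56, np.nan, 0.38, np.nan, 0.40],
})

# how="all" drops a row only if every value in it is null.
# No row here is entirely null, so nothing is dropped.
print(orders.dropna(how="all").shape)  # (5, 3)

# The columns attribute lists the names to copy into subset.
print(orders.columns)

# With subset, only the listed columns are checked.
# The last row is null in both of them, so it is dropped.
subset_all = orders.dropna(subset=["Customer Name", "Province"], how="all")
print(subset_all.shape)  # (4, 3)
```

On data with no nulls in the chosen columns, as in the video, the same subset call returns every row.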

I comment this out, uncomment that, and execute. Now you can see what subset does: instead of the entire row, only these two columns are checked, and a row is dropped only if it has null values in both of them. Since there are no such rows, you get the entire data set back.

After that, there is another parameter for dropna, which is thresh. If I show you, thresh is equal to None by default, and if I go down, the definition given here is: require that many non-NA values. In plain English, thresh equal to 3 means that a row is kept only if it has at least three values that are not null; rows with fewer non-null values than that are dropped. There may be a scenario where some rows have a lot of nulls, and you want to say that beyond a certain point those rows should just be dropped, because keeping them does not make sense.

For that you can specify how equal to any and thresh equal to 1. Here 1 means that if a row has at least one value that is not null, that row is not dropped; only rows with no non-null values at all are dropped. So I add shape, and again we get 8,399 by 21, because the only nulls are in Product Base Margin and every row has plenty of non-null values, so every row meets the threshold and nothing is dropped.

I hope you have found all of these parameters useful, and I will meet you in a new video with a new topic.
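To make the thresh rule concrete, here is a sketch on a hypothetical stand-in whose rows have three, two, one and zero non-null values. It passes thresh on its own: as far as I know, recent pandas releases raise a TypeError when how and thresh are given together, while older releases accepted both.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in: rows hold 3, 2, 1 and 0 non-null values respectively.
orders = pd.DataFrame({
    "Customer Name": ["Ann", "Bob", "Cara", None],
    "Province": ["Ontario", "Quebec", None, None],
    "Product Base Margin": [0.56, np.nan, np.nan, np.nan],
})

# thresh=N keeps a row only if it has at least N non-null values.
at_least_one = orders.dropna(thresh=1)
at_least_two = orders.dropna(thresh=2)
at_least_three = orders.dropna(thresh=3)

print(at_least_one.shape)    # (3, 3): only the all-null row is dropped
print(at_least_two.shape)    # (2, 3)
print(at_least_three.shape)  # (1, 3): only the complete row survives
```

Counting non-null values per row with orders.notnull().sum(axis=1) is a quick way to check which rows a given thresh would keep.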
