Selecting and removing columns from R dataframes


Hello and welcome to the next R Labs tutorial video. Today we're going to be talking about selecting and removing columns from R data frames. To start with, we're going to consider the essentials of indexing columns for R data frames. Once you've mastered this, we'll move on to see how we can select columns based on certain conditions. We can do this by providing an index that is a logical vector. Then, going beyond base R, we can use the dplyr package's select function and its helper functions, which allow us to select columns based on whether the column name contains a certain character, or whether it starts with or ends with certain characters.

Okay, so moving back to RStudio, let's open the package ggplot2 and assign to a new data frame called my.data the diamonds data set that is within the ggplot2 package. Let's inspect this new my.data data frame. There are ten variables; we can verify this by looking at ncol. The names of these columns are as follows: we've got the number of carats of the diamond, the cut of the diamond (which is a measure of quality), the color, the clarity, etc.

When indexing data frames, you provide the name of the data frame followed by square brackets. You then have the choice as to whether to provide one or two inputs within the square brackets. In the case where only one input is provided, the default is to address columns. In this case we're considering the data frame my.data, addressing the first column of that data frame, and assigning it to a new data frame. If we run the names function on this new data frame that we've just saved, we can see that it returns only one column, the first column, so that seems to have worked.

We can of course go beyond looking at just one column at a time. We can provide one input within the square brackets that considers a block of columns, so this command reads: for my.data, let's index columns one through three. To give us added flexibility, consider the one input that we provide to the data frame.
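
The steps described so far might look something like the following in R. This is only a sketch, not the code shown on screen in the video: it assumes the ggplot2 package is installed, it converts diamonds to a plain data frame so that base R indexing rules apply, and the names first.col and first.three are made up for illustration.

```r
# Assumes ggplot2 is installed; it ships the diamonds data set.
library(ggplot2)

# Copy diamonds into a data frame called my.data
# (as.data.frame is an assumption, to get a plain base R data frame).
my.data <- as.data.frame(diamonds)

ncol(my.data)    # 10
names(my.data)   # carat, cut, color, clarity, ...

# One input inside the square brackets addresses columns.
first.col <- my.data[1]        # hypothetical name; still a data frame
names(first.col)               # "carat"

# A range of positions selects a block of columns.
first.three <- my.data[1:3]    # hypothetical name
names(first.three)             # "carat" "cut" "color"
```

Note that single-bracket indexing with one input keeps the result as a data frame, which is why names() still works on a one-column selection.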

Within the square brackets, it can be a vector. In this case we're providing a vector that addresses the first column, fifth column, sixth column, seventh column, eighth column, ninth column and tenth column. The summary of this new data frame shows that it has the correct columns: carat is the first, depth is the fifth, table is the sixth, etc.

When putting values within the square brackets to index columns, we can choose to include or even exclude columns. By providing a negative sign as a prefix, we can exclude columns. This line would then read: for my.data, let's include all columns except column one. So it will return nine columns, excluding the carat column; this column that occurred previously has now been removed. An alternative way to do this is to state the data frame's name and the element of interest that you want to get rid of, and assign it a NULL value. This will just delete the column from the data frame. Let's reload the data and start afresh.

I've now described what happens if just one input is provided within the square brackets. If two inputs are provided, the first refers to rows and the second refers to columns. A blank value within the square brackets means consider all values. Therefore this line would read: for my.data, let's index all rows, and all columns excluding the first one. So in this case we started with ten columns, we removed the first column, and we're left with nine columns now, and we've got all the rows. To give you another example of this, here we have my.data and we're interested in rows 1 to 50, as that's the first input within the square brackets, and all columns except the first column. Let's check whether this has worked correctly. Yes, it has: we've got the last nine columns and we've now only got 50 rows.

Another way we can perform indexing is to provide a vector that has logical values. In this case we're calling the previous data frame, which has nine columns.
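
The vector, exclusion and two-input forms of indexing described above could be sketched as follows. Again this is an illustration under assumptions rather than the video's exact code: ggplot2 is assumed to be installed, diamonds is converted to a plain data frame, and the names picked, no.carat, all.rows and first.50 are hypothetical.

```r
library(ggplot2)                       # assumed installed; provides diamonds
my.data <- as.data.frame(diamonds)     # plain data frame (an assumption)

# A vector of positions: column 1 plus columns 5 through 10.
picked <- my.data[c(1, 5:10)]          # hypothetical name
names(picked)

# A negative index excludes a column: everything except column 1.
no.carat <- my.data[-1]                # hypothetical name
ncol(no.carat)                         # 9

# Alternative: assign NULL to the column to delete it in place.
my.data$carat <- NULL
ncol(my.data)                          # 9

# Reload the data to start afresh.
my.data <- as.data.frame(diamonds)

# Two inputs: rows first, columns second. A blank means "all".
all.rows <- my.data[, -1]              # all rows, all columns but the first
first.50 <- my.data[1:50, -1]          # rows 1 to 50, all columns but the first
dim(first.50)                          # 50 rows, 9 columns
```

The negative index returns a modified copy and leaves my.data untouched, whereas the NULL assignment changes my.data itself, which is why the data is reloaded afterwards.
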
We provide it with a logical vector that has nine values. If the value is TRUE, we choose to include that column in the next data frame, and if it's FALSE, we exclude it. So here we're left with just three variables, because we chose to only include the first three columns.

This gives us quite an interesting trick, as we can use conditional statements to derive a vector of TRUE and FALSE values based on a certain condition. For instance, we can use the sapply function, which goes through the list of columns within my.data, and ask it to apply the is.numeric function, which returns a TRUE or FALSE value according to whether the column is numeric or not. We can give this vector a name, and let's just inspect it: indeed, this returns a vector of TRUE and FALSE values depending on whether each column is numeric or not. We can then pass this vector in and use it for indexing. This is just one example of how you could use conditional statements to produce a logical vector and then provide it for indexing the columns of a data frame.

If we want to go beyond the base R approach, we can load the dplyr library, which allows us to do some other pretty interesting things with conditionally selecting columns. Because we're using the dplyr package, we're going to start using the syntax of the infix operator %>%, which is pronounced "then". We now approach the indexing of data frames through a distinct function that we get from the dplyr package. This line of code is equivalent to what you may have seen before, where we index a few columns, but it's the dplyr approach. So it reads: given the my.data data frame, then let's select these variables. I just want you to note a difference between this and how we call variables by name for indexing: in the dplyr case, it's not necessary for the column titles to be given in quotation marks. And yes, we can run the names command to check.
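
The logical-vector trick from the base R part of this passage could be sketched like this. It is a minimal example under assumptions: ggplot2 is assumed to be installed, diamonds is converted to a plain data frame, and the names nine.cols, keep, nums and numeric.only are hypothetical rather than taken from the video.

```r
library(ggplot2)                       # assumed installed; provides diamonds
my.data <- as.data.frame(diamonds)     # plain data frame (an assumption)

# The "previous" data frame with nine columns (carat removed).
nine.cols <- my.data[-1]               # hypothetical name

# One logical value per column: TRUE keeps the column, FALSE drops it.
keep <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)
first.three <- nine.cols[keep]
names(first.three)                     # "cut" "color" "clarity"

# Derive the logical vector from a condition: is each column numeric?
nums <- sapply(my.data, is.numeric)    # hypothetical name
nums                                   # named TRUE/FALSE vector, one per column

# Use the logical vector to keep only the numeric columns.
numeric.only <- my.data[nums]
names(numeric.only)
```

Any function that returns a single TRUE or FALSE per column (is.factor, for example) could be swapped in for is.numeric in the same way.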

On this new data frame that we've created, it returns the four columns that we tried to select. Excellent. This line might seem a bit long and cumbersome, but I hope to show you that moving to select within the dplyr package opens up a whole host of helper functions that we can nest within select to perform some pretty interesting manipulations in the way we choose columns.

Here we've got a similar line of code to the one above: given the my.data data frame, then let's select columns, and let's select the ones that start with the letter c. Remember that there were several columns that started with c. Let's look at the names of the columns in this output. And wow, yes, we've now selected four columns: carat, cut, color and clarity.

I'm not going to spend time in this video discussing the other helper functions, but I'll quickly introduce them. We have a contains helper function, which says: if a variable name has a dot in it, for instance, then include it. We have ends_with, which is the opposite of the function that I just demonstrated: if the name ends with a certain character, like the length or maybe some unit of measurement, then include it. And you've got matches: if this exact pattern occurs within a column name, then it's included.

In summary, this tutorial has talked you through the basics of how to select columns. I've also demonstrated that we can produce vectors that assess a certain condition, and then pass this logical vector into the data frame when indexing, to allow us to dynamically choose the columns that we want. Finally, using dplyr, we've found some interesting helper functions that might help us to automate and speed up the selection of columns if we've got a lot to choose from.

Thanks for listening to this R Labs tutorial video. If you found this useful, you might be interested in our free online Moodle course, or check out and subscribe to our YouTube channel, which has many interesting playlists about data handling, statistics and modeling.
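
For reference, the dplyr approach covered in the video could look roughly like the sketch below. It assumes the ggplot2 and dplyr packages are installed. The four columns chosen in the first select call, the patterns passed to contains, ends_with and matches, and all of the result names are illustrative guesses, since the transcript does not say exactly what was typed.

```r
library(ggplot2)   # assumed installed; provides diamonds
library(dplyr)     # assumed installed; provides select() and %>%

my.data <- as.data.frame(diamonds)

# "Given my.data, then select these variables"; no quotes needed.
# The four columns here are an illustrative choice.
four.cols <- my.data %>% select(carat, cut, color, clarity)
names(four.cols)

# starts_with(): columns whose names begin with "c".
c.cols <- my.data %>% select(starts_with("c"))
names(c.cols)      # "carat" "cut" "color" "clarity"

# Other helpers (patterns are illustrative, not from the video).
r.cols   <- my.data %>% select(contains("r"))        # name contains "r"
e.cols   <- my.data %>% select(ends_with("e"))       # name ends with "e"
xyz.cols <- my.data %>% select(matches("^[xyz]$"))   # regular expression match
names(xyz.cols)    # "x" "y" "z"
```

The difference between contains and matches is that contains looks for a literal string, while matches treats its argument as a regular expression.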
