Tag archive for: Column


Pandas is a free and open-source Python library that provides fast, flexible, and expressive data structures that make working with scientific data easy.

Pandas is one of Python’s most valuable data analysis and manipulation packages.

It offers features such as custom data structures that are built on top of Python.

This article will discuss converting a column from one data type to an int type within a Pandas DataFrame.

Setting Up Pandas

Before diving into how to perform the conversion operation, we need to set up Pandas in our Python environment.

If you are using the base environment in the Anaconda distribution, chances are you have Pandas installed.

However, on a native Python install, you will need to install it manually.

You can do that with pip. On Linux, run:

$ sudo pip3 install pandas

In Anaconda or Miniconda environments, install pandas with conda.

$ conda install pandas
$ sudo conda install pandas
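
To confirm that the installation worked, a quick check (a minimal sketch, not part of the original steps) is to import the package and print its version:

```python
import pandas as pd

# If this import succeeds, pandas is available in the active environment.
print(pd.__version__)
```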

Pandas Create Sample DataFrame

Let us set up a sample DataFrame for illustration purposes in this tutorial. You can copy the code below or use your DataFrame.

import pandas as pd
df = pd.DataFrame({'id': ['1', '2', '3', '4', '5'],
                   'name': ['Marja Jérôme', 'Alexios Shiva', 'Mohan Famke', 'Lovrenco Ilar', 'Steffen Angus'],
                   'points': ['50000', '70899', '70000', '81000', '110000']})

Once the DataFrame is created, we can check the data.

Pandas Show Column Type

Before converting a column to an int, it is good to know whether its existing values can be cast to an int.

For example, a column containing names cannot be converted to an int.
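
One way to run such a check up front is sketched below. The helper name can_cast_to_int is hypothetical (it is not a Pandas function), and the small DataFrame is only illustrative. The idea is to let pd.to_numeric() try to parse every value and then see whether anything failed or has a fractional part:

```python
import pandas as pd

# Illustrative data: one column of digit strings, one column of names.
df = pd.DataFrame({"id": ["1", "2", "3"], "name": ["Ann", "Bob", "Cy"]})


def can_cast_to_int(series):
    """Hypothetical helper: True if every value parses as a whole number."""
    # errors="coerce" replaces unparseable values with NaN instead of raising.
    numeric = pd.to_numeric(series, errors="coerce")
    return bool(numeric.notna().all() and (numeric % 1 == 0).all())


print(can_cast_to_int(df["id"]))
print(can_cast_to_int(df["name"]))
```

A column that passes this check should be safe to hand to astype(int); one that fails needs cleaning first.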

We can view the type of each column in a DataFrame using the dtypes property, accessed as DataFrame.dtypes.

In our sample DataFrame, we can get the column types as:

df.dtypes
id        object
name      object
points    object
dtype: object

We can see from the output above that none of the columns hold an int type.

Pandas Convert Column From String to Int

To convert a single column to an int, we use the astype() function and pass the target data type as the parameter.

The function syntax:

DataFrame.astype(dtype, copy=True, errors='raise')

  1. dtype – specifies the Python type or a NumPy dtype to which the object is converted.
  2. copy – when True (the default), returns a new object with copied data. astype() does not convert the original object in place, so assign the result back to keep it.
  3. errors – specifies the action in case of error. By default, the function will raise the errors.

In our sample DataFrame, we can convert the id column to int type using the astype() function as shown in the code below:

df['id'] = df['id'].astype(int)

The code above specifies the ‘id’ column as the target object. We then pass an int as the type to the astype() function.

We can check the new data type for each column in the DataFrame:

df.dtypes
id         int32
name      object
points    object
dtype: object

The id column has been converted to an int while the rest remains unchanged.
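
The output above shows int32, but the width you get from plain int can depend on your platform and NumPy version, so you may see int64 instead. If the width matters, a minimal sketch (using a small illustrative DataFrame) is to name it explicitly:

```python
import pandas as pd

# Illustrative data: ids stored as strings.
df = pd.DataFrame({"id": ["1", "2", "3"]})

# Passing "int64" pins the integer width regardless of the platform default.
df["id"] = df["id"].astype("int64")
print(df["id"].dtype)  # int64
```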

Pandas Convert Multiple Columns to Int

The astype() function also allows us to convert more than one column to a specific type at once.

For example, we can run the following code to convert the id and points columns to int type.

df[['id', 'points']] = df[['id', 'points']].astype(int)

Here, we are specifying multiple columns using the square bracket notation. This allows us to convert the columns to the data type specified in the astype() function.

If we check the column type, we should see an output:

df.dtypes
id         int32
name      object
points     int32
dtype: object

We can now see that the id and points columns have been converted to int32 type.

Pandas Convert Multiple Columns to Multiple Types

The astype() function allows us to specify a column and target type as a dictionary.

Assume that we want to convert the id column to int32 and the points column to float64.

We can run the following code:

convert_to = {"id": int, "points": float}
df = df.astype(convert_to)

In the code above, we start by defining a dictionary holding the target column as the key and the target type as the value.

We then use the astype() function to convert the columns in the dictionary to the set types.

Checking the column types should return:

df.dtypes
id          int32
name       object
points    float64
dtype: object

Note that the id column is int32 and the points column is of float64 type.

Pandas Convert Column to Int – to_numeric()

Pandas also provides us with the to_numeric() function. This function allows us to convert a column to a numeric type.

The function syntax is as shown:

pandas.to_numeric(arg, errors='raise', downcast=None)

For example, to convert the id column to numeric in our sample DataFrame, we can run:

df[‘id’] = pd.to_numeric(df[‘id’])

The code should take the id column and convert it into an int type.
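
The errors and downcast parameters from the syntax above are not exercised in the example. As a minimal sketch on illustrative data (the Series below are not part of the sample DataFrame), they behave like this:

```python
import pandas as pd

# Illustrative data with one value that is not a number.
raw = pd.Series(["1", "2", "three"])

# errors="coerce" turns values that cannot be parsed into NaN instead of raising.
coerced = pd.to_numeric(raw, errors="coerce")
print(coerced.isna().tolist())  # [False, False, True]

# downcast="integer" picks the smallest integer dtype that can hold the values.
small = pd.to_numeric(pd.Series(["1", "2", "3"]), downcast="integer")
print(small.dtype)  # int8
```

Because NaN is a float, the coerced column ends up as float64 rather than an int type until the missing values are dealt with.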

Pandas Convert DataFrame to Best Possible Data Type

The convert_dtypes() function in Pandas allows us to convert an entire DataFrame to the nearest possible type.

The function syntax is as shown:

DataFrame.convert_dtypes(infer_objects=True, convert_string=True, convert_integer=True, convert_boolean=True, convert_floating=True)

You can check the docs in the resource below:

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.convert_dtypes.html

For example, to convert our sample DataFrame to the nearest possible type, we can run:
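
The call itself is not reproduced in the text. A minimal sketch, assuming the method is called with its default arguments and that the DataFrame is in the state left by the previous section (id as int32, points as float64, name as plain strings), would be:

```python
import pandas as pd

# Rebuild the sample data as it stood after the dictionary-based conversion.
df = pd.DataFrame({
    "id": pd.Series([1, 2, 3, 4, 5], dtype="int32"),
    "name": ["Marja Jérôme", "Alexios Shiva", "Mohan Famke",
             "Lovrenco Ilar", "Steffen Angus"],
    "points": [50000.0, 70899.0, 70000.0, 81000.0, 110000.0],
})

# convert_dtypes() returns a new DataFrame, so assign the result back.
df = df.convert_dtypes()
print(df.dtypes)
```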

If we check the type:

df.dtypes
id         Int32
name      string
points     Int64
dtype: object

You will notice that each column has been converted to the nearest appropriate type, and that these are the nullable extension types Pandas provides. For example, the id column, which was int32, becomes the nullable Int32 type.

Likewise, the name column is converted to string type as it holds string values.

Finally, the points column, which held floats that are all whole numbers, is converted to the Int64 type.

Conclusion

In this article, we gave detailed methods and examples of converting a Pandas DataFrame from one type to another.





SQLite supports a limited subset of ALTER TABLE. With SQLite's ALTER TABLE statement, an existing table can be renamed, or have a column renamed, added, or dropped. The RENAME COLUMN command gives a column of the table a new name. A table can only be renamed within the same database. If the table has triggers or indexes, these are kept after the change. This guide covers renaming the columns of an SQLite table.

This article begins by launching the terminal on an Ubuntu 20.04 system. We opened it with Ctrl+Alt+T and started updating the system with the apt update instruction. After we entered the password for the currently logged-in user, the update ran to completion.

After updating the system, we make sure that its installed packages are upgraded to the newest versions using the apt upgrade instruction.

After successfully updating and upgrading our system, we launch the SQLite database within the terminal shell. The single keyword “sqlite3” starts it. The SQLite shell opens on our screen, and we can use it for querying data.

After opening it, we listed the tables of the database with the “.tables” instruction and found that there are no tables in it so far.

To rename a column, we must have a table in the database. Therefore, we create a table titled “Test” in our current SQLite database with the CREATE TABLE instruction. This table has two columns, ID and Name. The ID column holds an integer value that serves as the primary key of the table and must not be NULL. The Name column is of Text type and must not be NULL either. The “.tables” instruction now lists a “Test” table. Selecting the records of the Test table shows that it is empty and needs some records inserted into it.

Therefore, we inserted five records into the ID and Name columns of the table “Test” using the INSERT INTO instruction with the VALUES keyword followed by the records to be inserted. The five records are unique and have no duplicate values. After inserting the records, we checked the table with a SELECT instruction followed by the asterisk “*” character and the table name “Test”. This query returns all five records of the table, with the ID and Name columns separated by the “|” character.

Our column names are “ID” and “Name” for the Test table. Let’s start renaming the column names using the RENAME COLUMN instruction. We will be renaming the column “Name” to “Fname” using the ALTER TABLE instruction followed by the table name “Test” and the “RENAME COLUMN” using the “TO” keyword. The query was successful, as shown below:

sqlite> ALTER TABLE Test RENAME COLUMN Name TO Fname;

After altering the column’s name for the table “Test”, we will use the SELECT instruction to display all the table’s records. A total of five records have been displayed, as presented below:

sqlite> SELECT * FROM Test;
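
The same steps can also be reproduced outside the sqlite3 shell. The sketch below uses Python's built-in sqlite3 module rather than the shell used in this article, with an in-memory database and two placeholder rows (only the name "Ana" appears in this article; the other row is made up). It assumes the underlying SQLite library is version 3.25.0 or newer, which is when RENAME COLUMN was introduced:

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Test (ID INTEGER PRIMARY KEY NOT NULL, Name TEXT NOT NULL)"
)
# Placeholder rows for illustration only.
conn.executemany(
    "INSERT INTO Test (ID, Name) VALUES (?, ?)", [(1, "Ana"), (2, "Bea")]
)

# Rename the column; the stored rows are untouched.
conn.execute("ALTER TABLE Test RENAME COLUMN Name TO Fname")

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Test)")]
rows = conn.execute("SELECT * FROM Test").fetchall()
print(columns)
print(rows)
```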

Let’s check whether the new name has taken effect. First, we try to insert a record into the Test table using the original column names. We ran the INSERT INTO instruction with the original names of the “ID” and “Name” columns followed by the VALUES keyword and the 6th record, i.e., (6, “Barak”). Executing this instruction returns the error “table test has no column named Name”. The error occurs because we used the original column name “Name” instead of the new column name “Fname”.

Let’s insert the same record with the new column name “Fname” instead of the original column name “Name” via the INSERT INTO instruction in the terminal. This time, the insertion command ran without errors. We displayed all the records of the Test table using the SELECT instruction followed by the asterisk “*” character. A total of six records are displayed, the last being the record just inserted using the new column name “Fname”.

Just as with the INSERT instruction, we can use a column name in a SELECT instruction to confirm that the new name has been applied to the table column. So, we used the SELECT instruction to display the Test table records with a WHERE clause condition. First, we used the original column name, “Name”, to display only the records where the value in the Name column is “Ana”. Executing this query produced the error “no such column: Name”, because the column is now named “Fname”. Running the same query with the new column name “Fname” fetches all the records where the “Fname” column contains the value “Ana”. It returned a single record from the table without any error.

sqlite> SELECT * FROM Test WHERE Name = 'Ana';

sqlite> SELECT * FROM Test WHERE Fname = 'Ana';
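
The two error messages described above can be reproduced with a short sketch. As an assumption for illustration, it again uses Python's built-in sqlite3 module instead of the shell, and it creates the table directly with the Fname column so that it represents the state after the rename:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The table as it looks after the rename: the column is called Fname.
conn.execute(
    "CREATE TABLE Test (ID INTEGER PRIMARY KEY NOT NULL, Fname TEXT NOT NULL)"
)
conn.execute("INSERT INTO Test (ID, Fname) VALUES (1, 'Ana')")

# Both statements still use the old column name and are expected to fail.
errors = []
for sql in (
    "INSERT INTO Test (ID, Name) VALUES (6, 'Barak')",
    "SELECT * FROM Test WHERE Name = 'Ana'",
):
    try:
        conn.execute(sql)
    except sqlite3.OperationalError as exc:
        errors.append(str(exc))

print(errors)

# The same lookup with the new column name succeeds.
matches = conn.execute("SELECT * FROM Test WHERE Fname = 'Ana'").fetchall()
print(matches)
```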

Conclusion

This article discussed using the RENAME COLUMN clause within the ALTER TABLE instruction to change the name of a specific column in a table. The example can be adapted to your own tables. We have kept it as simple as possible and hope you like it.





By the end of this tutorial, you will understand how to use the astype() function in Pandas. This function allows you to cast an object to a specific data type.

Let us go exploring.

Function Syntax

The function syntax is as illustrated below:

DataFrame.astype(dtype, copy=True, errors='raise')

The function parameters are as shown:

  1. dtype – specifies the target data type to which the Pandas object is cast. You can also provide a dictionary with the data type of each target column.
  2. copy – specifies whether the result is returned as a copy of the data (the default) or may share data with the original DataFrame.
  3. errors – sets the error handling to either 'raise' or 'ignore'.

Return Value

The function returns a DataFrame with the specified object converted to the target data type.
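
A practical consequence is that the original object keeps its types unless you assign the result back. A minimal sketch with an illustrative one-column DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"col1": [10, 20, 30]})

# astype() hands back a new DataFrame; df itself is not modified.
converted = df.astype("float64")

print(df["col1"].dtype)         # int64
print(converted["col1"].dtype)  # float64
```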

Example

Take a look at the example code shown below:

# import pandas
import pandas as pd
df = pd.DataFrame({
    'col1': [10,20,30,40,50],
    'col2': [60,70,80,90,100],
    'col3': [110,120,130,140,150]},
    index=[1,2,3,4,5]
)
df

Convert Int to Float

To convert the ‘col1’ to floating-point values, we can do:

df.col1.astype('float64', copy=True)

The code above should return 'col1' as a Series of float64 values (10.0, 20.0, and so on), leaving df itself unchanged.

Convert to Multiple Types

We can also convert multiple columns to different data types. For example, we convert ‘col1’ to float64 and ‘col2’ to string in the code below.

print(f"before: {df.dtypes}\n")
df = df.astype({
    'col1': 'float64',
    'col2': 'string'
})
print(f"after: {df.dtypes}")

In the code above, we pass the column and the target data type as a dictionary.

The resulting types are float64 for 'col1' and string for 'col2', while 'col3', which is not in the dictionary, stays int64.

Convert DataFrame to String

To convert the entire DataFrame to string type, we can do the following:
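
The snippet for this step is not reproduced in the text. One way to do it, assuming the same 'string' extension dtype used in the previous section is what is wanted (passing str instead is a common alternative), is sketched here on a small illustrative DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"col1": [10, 20, 30], "col2": [60, 70, 80]})

# A single dtype (rather than a dictionary) is applied to every column.
df_str = df.astype("string")

print(df_str.dtypes)
print(df_str.loc[0, "col1"])  # 10, now stored as text
```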

The above should cast the entire DataFrame into string types.

Conclusion

In this article, we covered how to convert a Pandas column from one data type to another. We also covered how to convert an entire DataFrame into string type.

Happy coding!!





For this one, we will explore how to get the data type of a specific column in a Pandas DataFrame.

Sample

Let us start by creating a sample DataFrame:

# import pandas
import pandas as pd
df = pd.DataFrame({
    'salary': [120000, 100000, 90000, 110000, 120000, 100000, 56000],
    'department': ['game developer', 'database developer', 'front-end developer', 'full-stack developer', 'database developer', 'security researcher', 'cloud-engineer'],
    'rating': [4.3, 4.4, 4.3, 3.3, 4.3, 5.0, 4.4]},
    index=['Alice', 'Michael', 'Joshua', 'Patricia', 'Peter', 'Jeff', 'Ruth'])
print(df)

The above should create and print a DataFrame holding the sample data, indexed by name.

Pandas dtypes Attribute

The most straightforward way to get the column’s data type in Pandas is to use the dtypes attribute.

The syntax is DataFrame.dtypes.

The attribute returns each column and its corresponding data type.

For our sample DataFrame, df.dtypes should return the columns and their data types as shown:

salary          int64
department     object
rating        float64

If you want to get the data type of a specific column, you can pass the column name as an index, as in df.dtypes['salary'].

This should return the data type of the salary column, which is int64.
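
A minimal sketch of this lookup on a trimmed-down version of the sample data is shown below. It also uses the dtype attribute of a single column, which is an equivalent route not covered in the text above:

```python
import pandas as pd

# Two rows are enough to illustrate the lookup.
df = pd.DataFrame({"salary": [120000, 100000], "rating": [4.3, 4.4]})

# Index the dtypes Series by column name...
print(df.dtypes["salary"])  # int64

# ...or ask the column itself for its dtype.
print(df["rating"].dtype)   # float64
```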

Pandas Column Info

Pandas also provides us with the info() method. It allows us to get detailed information about the columns within a Pandas DataFrame.

The syntax is as shown:

DataFrame.info(verbose=None, buf=None, max_cols=None, memory_usage=None, show_counts=None)

It allows you to fetch the name of the columns, data type, number of non-null elements, etc.

For our sample DataFrame, calling df.info() prints a summary with one row per column.

The above shows detailed information about the columns in the DataFrame, including the data type.
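
As a minimal sketch on a trimmed-down version of the sample data, info() can print the summary directly, or write it into a buffer through the buf parameter when you want the text as a string:

```python
import io

import pandas as pd

df = pd.DataFrame({
    "salary": [120000, 100000],
    "department": ["game developer", "database developer"],
})

# Prints the summary (column names, non-null counts, dtypes) to standard output.
df.info()

# The same summary captured as text instead of printed.
buffer = io.StringIO()
df.info(buf=buffer)
summary = buffer.getvalue()
```

Note that info() itself returns None, so the buffer is the way to get at the text.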

Conclusion

This tutorial covers two methods you can use to fetch the data type of a column in a Pandas DataFrame.

Thanks for reading!!


