Published in Towards Data Science · Jan 17 · Member-only · "Tips and Tricks for Working with Strings in Polars": From sorting column names to splitting columns. In my past articles on Polars (https://medium.com/search?q=wei-meng+lee+polars), … (Polars · 9 min read)
Published in Towards Data Science · Jan 6 · Member-only · "Feature Engineering using Regular Expression (RegEx) in Pandas DataFrame": Discover how to manipulate your string columns easily using Regular Expressions. Manipulating string columns in Pandas is one of the most common operations that a data engineer will perform. Most of the time, you would do things like splitting columns, extracting key information from columns, etc. This task is commonly known as feature engineering. … (Regex · 13 min read)
Published in Towards Data Science · Dec 30, 2022 · Member-only · "Tips and Tricks for Loading Excel Spreadsheets into Pandas DataFrames": Learn how to properly load your worksheets into Pandas DataFrames. In most data analytics projects, one of the most common and popular data file formats that you will usually encounter is CSV. However, people from the financial sector often deal with another format: Excel spreadsheets. While a lot of articles on Pandas DataFrame focus on loading using CSV files… (Excel · 9 min read)
Published in Towards Data Science · Dec 22, 2022 · Member-only · "Tips and Tricks for Loading Large CSV Files into Pandas DataFrames — Part 2": Learn how to selectively load part of your CSV file into a DataFrame and at the same time reduce its memory footprint. In my previous article, "Tips and Tricks for Loading Large CSV Files into Pandas DataFrames — Part 1" (https://towardsdatascience.com/tips-and-tricks-for-loading-large-csv-files-into-pandas-dataframes-part-1-fac6e351fe79), I shared some techniques on how to load specific columns and rows from large CSV files into your Pandas DataFrames. In this article, I want to continue that discussion, but… (Memory Footprint · 17 min read)
Published in Towards Data Science · Dec 21, 2022 · Member-only · "Tips and Tricks for Loading Large CSV Files into Pandas DataFrames — Part 1": Learn how to read large CSV files to minimize memory usage as well as loading time. Most datasets you use in the real world are usually humongous: they come in gigabytes and contain millions of rows. In this article, I will discuss some techniques that you can employ when dealing with large CSV datasets. When dealing with large CSV files, there are two main concerns: … (Csv · 8 min read)
Published in Level Up Coding · Dec 16, 2022 · Member-only · "Installing Node.js using nvm": Learn how to easily install different versions of Node.js using nvm. You might have heard of Node.js, an open-source, cross-platform runtime environment for developing server-side and networking applications. Node.js uses JavaScript and it supports platforms like macOS, Windows, Linux, and IBM AIX. Node.js … (Nodejs · 3 min read)
Published in Level Up Coding · Dec 8, 2022 · Member-only · "Importing CSV Into MySQL Databases": Learn how to quickly import your CSV datasets into MySQL databases using MySQL Workbench. As you embark on your Data Science journey, most of the datasets you encounter are usually stored in CSV files. While dealing with CSV files is pretty common in Pandas, sometimes you need to access your data in another format, such as tables in a database server like MySQL… (Csv · 4 min read)
Published in CryptoStars · Dec 7, 2022 · Member-only · "Deploying and Running Solidity Smart Contracts on Zilliqa": Learn how to deploy and test your Solidity smart contracts on the Zilliqa testnet. In my earlier article on Zilliqa, I introduced the Scilla programming language for programming smart contracts on the Zilliqa blockchain: "Writing Smart Contracts for the Zilliqa Blockchain using the Scilla Programming Language: Learn Scilla by Example" (blog.cryptostars.is). While it is not too difficult to learn Scilla, most smart contract developers already have a head start in another smart contract language: Solidity for the Ethereum blockchain. … (Zilliqa · 4 min read)
Published in Towards Data Science · Dec 6, 2022 · Member-only · "Dealing with Date and Time in Pandas DataFrames": Learn how to manipulate date and time values in your Pandas DataFrames to make your life easier. One of the common tasks you often need to perform with Pandas DataFrames is manipulating date and time. Depending on how the date and time values are originally encoded in the dataset, you often have to spend considerable effort manipulating them so that you can use them… (Pandas · 8 min read)
Published in Towards Data Science · Oct 28, 2022 · Member-only · "Performing Data Analytics on the Flights Delay dataset using the Polars Library": Learn how you can use Polars to perform various data analytics on the 2015 Flights Delay dataset. The 2015 Flights Delay dataset is a classic dataset used by learners of data analytics. It was published by the U.S. Department of Transportation’s (DOT) Bureau of Transportation Statistics. … (Polars · 8 min read)