"You are not going to lose your job to an AI, but you are going to lose your job to someone who uses AI." –Jensen Huang, CEO, Nvidia
Excel is limited to handling approximately one million rows, which restricts its ability to combine and open large files effectively. Although Power BI Desktop offers basic filtering functions, processing the consolidated CSV file—merged using a PowerShell script—takes several hours, and opening the resultant file may exceed ten minutes. Additionally, extracting aggregated folder structure and size information remains a complex and non-trivial task.
Of course, the ultimate way is to load all 18M records into a SQL table and go from there. However, before committing to the “thermonuclear” option, I wanted to try my luck with Copilot.
Copilot in Excel works for a single CSV file, but only barely. In Excel, I can save a CSV file as XLSX and enable Copilot’s “Advanced Analytics” feature. Then I can give Copilot a prompt like this:
"Create a list of all the folders in the dataset with full folder path, folder name, total items (include all items in all the subfolders underneath it), total items size (also include all items in all the subfolders."
Alas, Copilot could not finish its deep data analysis because “all subfolder items and sizes could not be completed because it takes too long for such a large dataset”.
A dead end due to row limitations even on 1/182 of the total dataset!
Microsoft recently released several free generative AI agents for Microsoft 365 Copilot licensed users. One of them is called Analyst, which can analyze, visualize, and interpret complex data quickly. I have had some good luck with it on small amounts of data. Maybe it could help me deal with large amounts of data too.
Soon I ran into limitations as well:
At least it is better than processing those files one by one, I thought.
I tried several times, with different numbers of CSV files (each CSV is around 10 MB in size). Sometimes it produced the desired results, but most of the time its Python processes would error out and complain that they could not find the files I had just uploaded.
Seems like another dead-end!
Suddenly I had a “Eureka” moment: "Why can’t I use Analyst’s generated Python code to process the data myself?" That way I wouldn’t run into the “missing files” situation and wouldn’t be subject to Analyst’s limitations and internal Python errors (“bugs”, may I say?).
The only problem was that I had never used Python and had no idea how to run Python code on a PC!
With Copilot, I can quickly find answers and learn new tasks, thanks to my relatively strong background in C# and PowerShell, among other development tools.
Below are some screenshots of me learning to install Python on my laptop and set up its modules:
The next challenge is the execution of Microsoft 365 Copilot Analyst-generated Python code on a personal computer. Upon attempting to run the copied code locally, I immediately encountered Python runtime errors.
The good thing is that I can ask Analyst to help me figure out how to fix those issues.
Eventually, I got fully working code with some minor edits: skipping the deep folder structure and adding “print” statements to show progress. I got the desired results from all 182 CSV files in less than 30 minutes!
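For readers curious what such a script might look like, here is a minimal sketch of the general idea. It is not the code Analyst generated for me. The column names `FolderPath` and `SizeBytes`, the `/` path separator, and the helper names are all assumptions made for illustration, so adjust them to match your own CSV export. The sketch streams each CSV once, credits every item to its own folder and to every parent folder above it, and prints a progress line per file.

```python
import csv
from collections import defaultdict

# Hypothetical column names; change these to match the real CSV export.
PATH_COLUMN = "FolderPath"  # folder that directly contains the item
SIZE_COLUMN = "SizeBytes"   # item size in bytes, assumed to be a whole number


def ancestors(folder_path, sep="/"):
    """Yield the folder itself and each parent above it.

    'a/b/c' yields 'a/b/c', then 'a/b', then 'a'.
    Leading and trailing separators are dropped.
    """
    parts = [part for part in folder_path.split(sep) if part]
    for depth in range(len(parts), 0, -1):
        yield sep.join(parts[:depth])


def roll_up(csv_files, sep="/"):
    """Read every CSV row by row and total item counts and sizes per folder.

    Each item counts toward its own folder and all of its ancestor folders.
    """
    totals = defaultdict(lambda: [0, 0])  # folder -> [item count, total bytes]
    for index, csv_file in enumerate(csv_files, start=1):
        print(f"Processing file {index}: {csv_file}")  # progress output
        with open(csv_file, newline="", encoding="utf-8-sig") as handle:
            for row in csv.DictReader(handle):
                size = int(row[SIZE_COLUMN] or 0)  # blank size counts as 0
                for folder in ancestors(row[PATH_COLUMN], sep):
                    totals[folder][0] += 1
                    totals[folder][1] += size
    return totals


def write_summary(totals, output_file, sep="/"):
    """Write one row per folder: full path, folder name, items, bytes."""
    with open(output_file, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["FolderPath", "FolderName", "TotalItems", "TotalSizeBytes"])
        for folder in sorted(totals):
            count, size = totals[folder]
            writer.writerow([folder, folder.split(sep)[-1], count, size])
```

To use it, pass a list of CSV file paths to `roll_up` and hand the result to `write_summary`, for example `write_summary(roll_up(my_csv_paths), "folder_summary.csv")`, where `my_csv_paths` and the output file name are placeholders. Because the files are read one row at a time, only the per-folder totals stay in memory rather than all 18M records.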