from Hacker News

Ask HN: What can I do with the source code of hundreds of enterprise projects?

by xauronx on 9/24/20, 4:05 PM with 0 comments

I'm in the somewhat unusual position of having access to about a decade of Git repositories for enterprise projects. Nothing illicit: we're a consulting company and I can access this code. I have two questions:

1) In what ways can this data be used to benefit my team or company?

2) Are there any tools on the market for evaluating huge amounts of source code?

I am the indirect manager of around 200 developers, and I'm an engineer myself, so my mind tends to go to improving development practices. Some things I've thought of:

1) Gather basic stats about the team for fun (e.g., how many LOC were written this week, how many LOC we are the stewards of, etc.).

2) Look for patterns and identify opportunities for reusable modules. If 20% of our projects have a class called "PhoneNumberHelper" and no one has pulled it out into a reusable component... do that.

3) Look for missing platform features and sell that information back to our vendor. If we can show that "of the last 200 projects we did, 50 of them (your biggest customers) had to custom build X", that seems like valuable information they may currently gather only through word of mouth.

4) Do some sort of organization-wide code quality evaluation. Potentially (hopefully) show improvement over time as we've matured our practices.
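
To make ideas 1 and 2 concrete, here is a rough Python sketch, not a finished tool. It assumes every project is checked out as its own subdirectory under one root folder, treats a source file's name as a stand-in for the class it defines (so PhoneNumberHelper.java counts as "PhoneNumberHelper"), and counts raw physical lines, blanks and comments included. The function names, the extension list, and the 20% threshold are all placeholders I made up for illustration.

```python
from collections import Counter
from pathlib import Path

# Assumption: which extensions count as source code; adjust for your stack.
SOURCE_EXTENSIONS = {".py", ".java", ".cs", ".js", ".ts"}


def scan_projects(root):
    """Return (raw line count per project, number of projects containing each file stem)."""
    loc_per_project = {}
    projects_per_stem = Counter()
    for project in sorted(Path(root).iterdir()):
        if not project.is_dir():
            continue
        loc = 0
        stems = set()
        for path in project.rglob("*"):
            # Skip Git internals and anything that is not a regular file.
            if ".git" in path.relative_to(project).parts or not path.is_file():
                continue
            if path.suffix not in SOURCE_EXTENSIONS:
                continue
            stems.add(path.stem)
            with path.open(errors="ignore") as handle:
                loc += sum(1 for _ in handle)  # physical lines, not logical LOC
        loc_per_project[project.name] = loc
        projects_per_stem.update(stems)  # a project counts once per stem
    return loc_per_project, projects_per_stem


def reuse_candidates(projects_per_stem, project_count, min_share=0.2):
    """File stems that show up in at least min_share of all projects."""
    threshold = min_share * project_count
    return sorted(stem for stem, n in projects_per_stem.items() if n >= threshold)
```

Calling scan_projects on the root folder and passing the resulting counter to reuse_candidates with the number of projects gives a first list of names worth a manual look. Matching on file names will miss renamed or inlined helpers and will flag generic names such as "Utils", so the output is a starting point for review rather than an answer.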

It seems like a pretty crazy resource that we do nothing with as of now (other than on a project-by-project basis).

Note: I understand this is a security and IP minefield. I'm in a "theoretical" stage right now.