Git search and replace

Here is a technique for doing a search-and-replace inside a git repository. I find myself doing this a few times a year, and that's rare enough that I always forget the exact commands. So I thought I'd write it down here for posterity.

TL;DR

git grep -l SEARCH | xargs sed -i 's/SEARCH/REPLACE/g'

If you want more details, keep reading.

The problem

Say you're doing a refactor in your code, or some major overhaul of your documentation, and you find yourself wanting to do a search and replace across all files in your repository.

Let's say for example that you've been hard at work developing a cool new web framework called jingo (it stands for "javascript inspired network graph object" in case you were wondering). One day you realize with horror that the name you've chosen is awfully similar to another popular framework. So just to be safe, you decide to rename your project and change its name to flosk ("forward logic object software kit"). Now it's time to update all your code and documentation!

My solution

I imagine that today's fancy text editors have built-in ways to do these kinds of refactors, but personally I don't use a fancy text editor and I think it's useful to know how to do this "by hand".

My solution is to break the problem into two parts. First, find all the occurrences of the text I want to remove (jingo in my example from earlier), then replace that text with the new one (flosk).

Search with git grep ...

The classic tool for finding text in a series of files is grep, but in this case I will use git grep because it has some useful defaults when dealing with a git repository. In theory all the examples should work using plain grep but might require a bit of extra work.

git grep jingo

This will find all occurrences of jingo in any tracked file inside the git repository, and show (by default) the name of the file and the line that contains the text.

In my case I don't want to see the match, just the list of files. That's what the --files-with-matches flag is for (-l for short):

git grep -l jingo
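
If you want to try this without touching a real project, here is a small sketch that builds a throwaway repository first. The file names and contents are made up for the example, and it assumes git and mktemp are available on your machine.

```shell
# Build a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical files: two mention jingo, one does not.
printf 'Welcome to jingo\n' > README.txt
printf 'import jingo\n' > app.py
printf 'nothing to see here\n' > notes.txt
git add .   # by default git grep only searches tracked files

# Matching lines, each prefixed with the file name
git grep jingo

# Only the names of the files that contain a match
files=$(git grep -l jingo)
echo "$files"
```

Note that the files only need to be added, not committed, for git grep to see them.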

Replace with sed

The classic tool for doing search-and-replace inside a file is the Unix utility sed. It's a powerful tool but it can also be a little tricky to use. It has its own mini language for writing commands that modify the file you give it in different ways. It calls those commands "expressions", and the first letter of the expression tells you what kind of operation is done. In our case we will use the s command, which stands for "substitute". I have also used the d command in the past for deleting specific lines.

sed 's/jingo/flosk/g' README.txt

By default, sed will not modify any files you give it. Instead it applies the command you gave it to the file and prints the modified version to standard output, leaving the original file intact. For our use case we actually want to modify the file, so we need to give sed the --in-place option (-i for short):

sed -i 's/jingo/flosk/g' README.txt
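
To see the difference between the two forms, here is a quick sketch on a scratch file (the file contents are invented). It assumes GNU sed, which is what most Linux systems ship. The BSD sed found on macOS expects a backup suffix right after -i, so there the equivalent would be sed -i '' 's/jingo/flosk/g'.

```shell
# Work on a scratch file so nothing real is modified.
file=$(mktemp)
printf 'jingo is great\nuse jingo today\n' > "$file"

# Without -i: the result goes to standard output...
preview=$(sed 's/jingo/flosk/g' "$file")
echo "$preview"

# ...and the file on disk still says jingo.
before=$(cat "$file")
echo "$before"

# With -i (GNU sed syntax assumed): the file itself is rewritten.
sed -i 's/jingo/flosk/g' "$file"
after=$(cat "$file")
echo "$after"
```

Running the command without -i first is a cheap way to preview a substitution before committing to it.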

Putting it together with xargs

We're almost there! With git grep ... we can find the files, and with sed ... we can search/replace inside one file. All we need is to "glue" the two together and run the sed command on each file. One way I like to do it is using the xargs command:

git grep -l jingo | xargs sed -i 's/jingo/flosk/g'
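
Since everything lives in git, you can also ask git what the pipeline just did. The sketch below runs the whole thing in a throwaway repository (made-up files again, GNU sed assumed for -i) and then lists the modified files with git diff.

```shell
# Throwaway repository with hypothetical files.
repo=$(mktemp -d)
cd "$repo"
git init -q
printf 'Welcome to jingo\n' > README.txt
printf 'import jingo\n' > app.py
printf 'nothing to see here\n' > notes.txt
git add .

# The search-and-replace itself
git grep -l jingo | xargs sed -i 's/jingo/flosk/g'

# Which files differ from what was staged?
changed=$(git diff --name-only)
echo "$changed"

# git grep exits with a non-zero status when nothing matches
git grep -l jingo || echo "no matches left"
```

Dropping --name-only from git diff shows the actual line changes, which is handy for a final review before committing.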

Going a bit further

Dealing with files containing spaces in their names

Because of the way xargs works, the command I've shown will break if any of the matched files has a space in its name. For example, if the file is called READ ME.txt then xargs will tell sed to open two files: READ and ME.txt. One solution is to tell git grep to use a special character to separate the file names (instead of putting one file name per line), and to tell xargs to use that same character for splitting the list of file names back into individual ones before passing them to sed. The character used is called a "null byte" and cannot appear in file names.

The option is called --null for both git grep and xargs, but unfortunately its short version is not the same with both tools: git grep uses -z while xargs uses -0 (a zero).

Here's what our example looks like now:

git grep -lz jingo | xargs -0 sed -i 's/jingo/flosk/g'
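
Here is a small sketch that shows both the failure and the fix, using a throwaway repository with a single made-up file whose name contains a space (GNU sed assumed for -i).

```shell
# Throwaway repository with one awkwardly named file.
repo=$(mktemp -d)
cd "$repo"
git init -q
printf 'Welcome to jingo\n' > 'READ ME.txt'
git add .

# Newline-separated names: xargs splits the name into READ and ME.txt,
# sed cannot open either of them, and the file is left untouched.
git grep -l jingo | xargs sed -i 's/jingo/flosk/g' || echo "sed could not open the files"
unchanged=$(cat 'READ ME.txt')

# Null-separated names: the file name survives in one piece.
git grep -lz jingo | xargs -0 sed -i 's/jingo/flosk/g'
fixed=$(cat 'READ ME.txt')
echo "$fixed"
```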

Restricting the target files

Because we use git grep, we can use its full power when it comes to targeting specific files (what git calls "pathspecs"). For example, let's say we only want to target Python files:

git grep -l jingo -- '*.py' | xargs sed -i 's/jingo/flosk/g'
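
As a rough illustration, the sketch below applies the replacement to Python files only, checks what is left, and then targets a directory instead of a file pattern. The repository layout (app.py, README.txt, docs/guide.txt) is invented for the example and GNU sed is assumed for -i.

```shell
# Throwaway repository with a hypothetical layout.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir docs
printf 'import jingo\n' > app.py
printf 'Welcome to jingo\n' > README.txt
printf 'The jingo manual\n' > docs/guide.txt
git add .

# Only Python files are rewritten
git grep -l jingo -- '*.py' | xargs sed -i 's/jingo/flosk/g'
after_py=$(git grep -l jingo)
echo "$after_py"

# A pathspec can also be a directory
git grep -l jingo -- docs/ | xargs sed -i 's/jingo/flosk/g'
after_docs=$(git grep -l jingo)
echo "$after_docs"
```

The quotes around '*.py' matter: they stop the shell from expanding the pattern itself, so that git gets to match it against every tracked path, including those in subdirectories.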

More complex search/replace with regular expressions

So far we've used fixed strings for both search and replace, but sometimes you need a bit of extra power. Luckily both git grep and sed support extended regular expressions: git grep has --extended-regexp while sed has --regexp-extended. It's a bit annoying that they use different names, but at least the short name of the flag is the same for both: -E.

As an example, here is how you can replace all instances of both jingo and JINGO with flosk:

git grep -l -E '(jingo|JINGO)' | xargs sed -i -E 's/(jingo|JINGO)/flosk/g'

To use capture groups in the replacement sed command, we can use \1 (and \2, ... if there is more than one capture group):

git grep -l -E '(jingo|JINGO)' | xargs sed -i -E 's/(jingo|JINGO)/the framework formerly known as \1/g'
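
When I'm unsure about a regular expression, I find it easier to test the sed part on its own first by piping a sample line into it. A minimal sketch (the sample strings and the jingo_app name are made up):

```shell
sample='jingo and JINGO are the same project'

# Alternation: both spellings are replaced
both=$(echo "$sample" | sed -E 's/(jingo|JINGO)/flosk/g')
echo "$both"

# One capture group: \1 is whatever the parentheses matched
renamed=$(echo 'JINGO rocks' | sed -E 's/(jingo|JINGO)/the framework formerly known as \1/g')
echo "$renamed"

# Two capture groups: \1 and \2 can be reordered in the replacement
swapped=$(echo 'jingo_app' | sed -E 's/([a-z]+)_([a-z]+)/\2_\1/')
echo "$swapped"
```

Once the output looks right, the same expression can go into the full git grep and xargs pipeline.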

I hope that helps, happy refactoring!