Dealing with File Sequences in Python
I have been parsing other people's files a lot lately, and finally took the time to write a little function that gives me general information about a sequence of files. It uses a regex to yank the last numeric part out of a filename and work out the padding, then uses glob to tell you how many files are in the sequence. Here’s the code and an example usage:
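import glob
import os
import re

# returns [base name, padding, filetype, number of files, first file, last file]
def getSeqInfo(file):
    dir = os.path.dirname(file)
    file = os.path.basename(file)
    # the last run of digits in the file name is the frame number
    segNum = re.findall(r'\d+', file)[-1]
    numPad = len(segNum)
    # split on the last occurrence, in case the same digits also appear earlier in the name
    baseName, _, suffix = file.rpartition(segNum)
    fileType = file.split('.')[-1]
    # one '?' wildcard per digit of padding
    globString = baseName + '?' * numPad
    # glob order is not guaranteed, so sort before picking the first and last file
    theGlob = sorted(glob.glob(os.path.join(dir, globString + suffix)))
    numFrames = len(theGlob)
    firstFrame = theGlob[0]
    lastFrame = theGlob[-1]
    return [baseName, numPad, fileType, numFrames, firstFrame, lastFrame]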
Here is an example of usage:
print(getSeqInfo('E:\\data\\data\\Games\\Project\\CaptureOutput\\Frame000547.jpg'))
>>['Frame', 6, 'jpg', 994, 'E:\\data\\data\\Games\\Project\\CaptureOutput\\Frame000000.jpg', 'E:\\data\\data\\Games\\Project\\CaptureOutput\\Frame000993.jpg']
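Once you have the base name, padding, and file type, rebuilding the path of any frame in the sequence is a single format operation. Here is a minimal sketch, assuming the values come from the list that getSeqInfo returns; `frame_path` is a hypothetical helper name, not part of the function above:

```python
import os

def frame_path(folder, base_name, num_pad, file_type, frame):
    """Build the path of one frame from the pieces getSeqInfo reports."""
    # '%0*d' reads the zero-padding width (num_pad) from the argument list
    name = '%s%0*d.%s' % (base_name, num_pad, frame, file_type)
    return os.path.join(folder, name)

# Example with the values from the output above; the folder is shortened here
print(frame_path('CaptureOutput', 'Frame', 6, 'jpg', 547))
```

This is handy for walking a sequence frame by frame without globbing again.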
I know this is pretty simple, but I looked around a bit online and didn’t see anything readily available showing how to deal with differently numbered file sets. I have needed something like this for a while, something that works with anything from OBJs sent by external contractors to images from After Effects…
Hi Chris, thanks for sharing! Not sure you’ll be interested, but since I also work with a lot of files in Python, I wanted to show how I’d have done it:
import os
import re

def getSeqInfo(fpath):
    folder, fname = os.path.split(fpath)
    # groups: base name, frame number, file type
    match = re.match(r"^(.*?)(\d+)\.(.*?)$", fname)
    if match is None:
        raise RuntimeError("Unable to find sequence number")
    baseName, sequenceNum, fileType = match.groups()
    numPad = len(sequenceNum)
    # escape the literal parts so characters such as '.' or '+' are not read as regex syntax
    seqPattern = re.compile(r"^%s\d{%d}\.%s$" % (re.escape(baseName), numPad, re.escape(fileType)))
    # fall back to the current directory when fpath has no folder part
    names = [name for name in os.listdir(folder or ".") if seqPattern.match(name)]
    if not names:
        raise RuntimeError("No matching file found")
    names.sort()
    numItems = len(names)
    firstItem = names[0]
    lastItem = names[-1]
    return [baseName, numPad, fileType, numItems, firstItem, lastItem]
May not be perfect, but it’s only another version 🙂
Comment by rotoglup — 2010/04/21 @ 12:14 AM
Man, you’re awesome! I just bought my little regex cheat book, and this stuff is like Chinese to me. Do you know of a site that has regex tutorials based on solid scenarios? 😀
Comment by admin — 2010/04/25 @ 4:27 PM
It often looks like Chinese to me too, when I come back to a regexp some time after writing it! Sorry, I have no solid pointer to give… I only get by through sweat, pain, and general knowledge of regexp principles, plus the syntax summary in the Python docs. I limit myself to basic uses, as regexps can be painful to debug and re-read! Hang on, you’ll make it 😉
Comment by rotoglup — 2010/05/03 @ 8:15 PM
V.cool little snippet, throwing this into the farm control here 🙂
Love Regex, got a couple of those generic cheat sheet pdfs. Takes a bit of getting into but once it’s in your script you wonder how you managed without it!
Best
TxRx
😉
Comment by TxRx — 2010/05/31 @ 12:58 PM
For me, this is very helpful book:
http://oreilly.com/catalog/9781565922570
Comment by Marin Petrov — 2010/07/27 @ 3:45 AM
Hi, thanks, you just saved me some time. Very knowledgeable work, too.
A few questions:
1- What would be smart pseudocode steps to find the missing files in a sequence?
2- Is there any way to detect the number of sequences in a folder? I mean, what if someone puts 2 file sequences in one folder? The Nuke file open dialog shows this very nicely. I always wanted to ask their developer how he solved this. I also found that cgkit (Python) has an interesting module named ‘sequence’.
but thanks again.
Comment by lala — 2014/04/27 @ 7:33 PM
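Both questions above can be approached by grouping file names on everything except the frame number. The sketch below is one possible way to do it, not how Nuke's dialog or cgkit's sequence module work; `find_sequences` and `missing_frames` are hypothetical names, and it assumes that files of one sequence share the same base name, padding width, and extension (so an unpadded sequence running from 9 to 10 would be split in two).

```python
import re
from collections import defaultdict

# groups: base name, trailing frame number, extension
SEQ_RE = re.compile(r'^(.*?)(\d+)\.([^.]+)$')

def find_sequences(names):
    """Group file names into sequences keyed by (base, padding, extension)."""
    sequences = defaultdict(list)
    for name in names:
        match = SEQ_RE.match(name)
        if match:  # names without a frame number are ignored
            base, digits, ext = match.groups()
            sequences[(base, len(digits), ext)].append(int(digits))
    return dict((key, sorted(frames)) for key, frames in sequences.items())

def missing_frames(frames):
    """Return the frame numbers absent between the first and last frame."""
    if not frames:
        return []
    present = set(frames)
    return [n for n in range(min(frames), max(frames) + 1) if n not in present]

# Made-up file names for illustration; in practice pass os.listdir(folder)
names = ['shotA_0001.exr', 'shotA_0002.exr', 'shotA_0004.exr',
         'shotB_010.jpg', 'shotB_011.jpg', 'notes.txt']
for key, frames in sorted(find_sequences(names).items()):
    print(key, len(frames), missing_frames(frames))
```

The number of sequences in the folder is then just the number of keys in the returned dictionary, and each key carries what you need to rebuild the file names.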
I have written a PySide widget that browses the file system and displays file sequences.
https://github.com/zebulon75018/FileSequenceWidget
Maybe it could be useful to somebody.
Comment by zebulon — 2016/03/15 @ 2:38 PM
I’ve tried to use pyseq with directories containing many thousands of files, and I always get “Argument list too large.” errors. Have any of you encountered this error and found a work-around?
Comment by Len — 2016/04/30 @ 11:27 PM
I know this is an old post, but Len’s comment wasn’t too long ago. In my industry I run into this problem all the time, and PySeq never worked for me with more than a few hundred files. Since the average movie is over 120,000 frames, it was nearly impossible to use in any real-world application.
I’m currently developing a tool (it’s pretty close to release and quite usable as-is) that solves the efficiency problem and can handle millions of files in just a few minutes on a low-end Mac mini. Feel free to give it a try and let me know if you like it and how it can be improved!
https://github.com/codycuellar/ultrasequence
Comment by Cody Cuellar — 2017/12/21 @ 12:42 AM
Awesome, thanks for sharing!
Comment by Chris — 2018/01/03 @ 12:01 PM