gcue Posted June 22, 2016 Share Posted June 22, 2016 hello i am thinking of storing information on several thousands of files in a sqlite file. to help with duplicates, is there a way to sort the actual database file by (not results) but actual data in a specific column? this way i can search for a something and assume the next item is a duplicate and if its not then exitloop. (hope that makes sense) to show all results i was thinking of using a treeview and if there are duplicates then show them as subitems. please let me know any of your thoughts on the matter thanks! Link to comment Share on other sites More sharing options...
jchd Posted June 22, 2016 Share Posted June 22, 2016 Are you talking about text files or files with an homogenous structure? Because "sort by actual data" and "search for something" are pretty vague concepts. And how would you decide that two files are "duplicate"? This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe hereRegExp tutorial: enough to get startedPCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt) Link to comment Share on other sites More sharing options...
gcue Posted June 22, 2016 Author Share Posted June 22, 2016 files would be media files - so id store metadata for each file. i'd know each file was duplicate by its sha1 hash value - so id want to sort by the column storing those values Link to comment Share on other sites More sharing options...
Gianni Posted June 23, 2016 Share Posted June 23, 2016 Hi @gcue, in this thread, I've posted a possible way to go. Chimp small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt.... Link to comment Share on other sites More sharing options...
jchd Posted June 23, 2016 Share Posted June 23, 2016 To use @Chimp query, you'll need something to build the database. This is a skeleton you could use. Hasher.au3 is by trancexx in the example forum. expandcollapse popup#include <String.au3> #include <File.au3> #include <SQLite.au3> #include "..\Include\Autres\hashes.au3" Local $dll = _SQLite_Startup("c:\bin\sqlite3.dll") Local $hDB Local $DBname = "MyFiles.sq3" FileDelete($DBname) $hDB = _SQLite_Open($DBname) _SQLite_Exec($hDB, "pragma page_size = 64536;") _SQLite_Exec($hDB, "pragma cache_size = 10000;") _SQLite_Exec($hDB, "pragma journal_mode = WAL;") ConsoleWrite(_SQLite_LibVersion() & @LF) Local $aRows, $iCols, $iRows _SQLite_GetTable2d($hDB, "pragma compile_options", $aRows, $iRows, $iCols) _SQLite_Display2DResult($aRows) _SQLite_Exec($hDB, _ "create table if not exists Files (" & _ "Id integer primary key, " & _ "Title text collate nocase, " & _ "Album text collate nocase, " & _ "Hash text" & _ ");" _ ) Local $Time = TimerInit() Local $sDir = "C:\Users\jc\Music\CDs" Local $aFiles = _FileListToArrayRec($sDir, "*.flac", $FLTAR_FILES, $FLTAR_RECUR, $FLTAR_FASTSORT, $FLTAR_FULLPATH) If @error Then ConsoleWrite(@extended & @LF) Else $Time = TimerDiff($Time) Local $iFiles = $aFiles[0] ConsoleWrite("Enumerating " & $iFiles & " files in array took " & Round($Time / 1000, 2) & "s" & " (" & Round($Time * 1000 / $iFiles, 1) & "µs per file)" & @LF) $Time = TimerInit() Local $sDrive, $sFolder, $sName, $sExt, $sParent _SQLite_Exec($hDB, "begin immediate") For $i = 1 To $iFiles _PathSplit($aFiles[$i], $sDrive, $sFolder, $sName, $sExt) _SQLite_Exec($hDB, "insert into " & _ "Files (" & _ "Title, " & _ "Album, " & _ "Hash" & _ ") values (" & _ X($sName) & _ XX($sFolder) & _ XX(_SHA1ForFile($aFiles[$i])) & _ ")") Next _SQLite_Exec($hDB, "end") $Time = TimerDiff($Time) ConsoleWrite("Inserting " & $iFiles & " files in database took " & Round($Time / 1000, 2) & "s (" & Round($Time * 1000 / $iFiles, 1) & "µs per file)" & @LF) _SQLite_Exec($hDB, "create index if not exists ixAlbumTitle on Files (Album, Title)") EndIf _SQLite_Close($hDB) _SQLite_Shutdown() Func X($s) Return "'" & StringReplace($s, "'", "''") & "'" EndFunc Func XX($s) Return ", '" & StringReplace($s, "'", "''") & "'" EndFunc This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe hereRegExp tutorial: enough to get startedPCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt) Link to comment Share on other sites More sharing options...
gcue Posted June 23, 2016 Author Share Posted June 23, 2016 thank you for the help and feedback =) Link to comment Share on other sites More sharing options...
jchd Posted June 23, 2016 Share Posted June 23, 2016 (edited) Of course you'll have to adapt the code to your setup and needs. Once loaded, you can simply use SQLite Expert to query. With the DB created by the above script on my own PC I can now query: select * from files where title like '%love%' order by album, title and get fast answer. Adapt Chimp's query to list duplicates hashes. This is only the simplest setup, the schema can be extended to support much, much more. Edited June 23, 2016 by jchd This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe hereRegExp tutorial: enough to get startedPCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt) Link to comment Share on other sites More sharing options...
gcue Posted June 23, 2016 Author Share Posted June 23, 2016 thanks again! Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now