diff --git a/Changelog.md b/Changelog.md index d346484..984804a 100644 --- a/Changelog.md +++ b/Changelog.md @@ -5,6 +5,7 @@ ### CLI - Providing full static rust binary with [Eyra](https://github.com/sunfishcode/eyra) - [#1102](https://github.com/qarmin/czkawka/pull/1102) +- Fixed duplicated `-c` argument, now saving as compact json is handled via `-C` - ???? ### Krokiet GUI - Initial release of new gui - [#1102](https://github.com/qarmin/czkawka/pull/1102) @@ -17,7 +18,7 @@ - Added bigger stack size by default(fixes stack overflow in some musl apps) - [#1102](https://github.com/qarmin/czkawka/pull/1102) - Added optional libraw dependency(better single-core performance and support more raw files) - [#1102](https://github.com/qarmin/czkawka/pull/1102) - Speedup checking for wildcards and fix invalid recognizing long excluded items - [#1152](https://github.com/qarmin/czkawka/pull/1152) -- Even 10x speedup when searching for empty folders - [#1152](https://github.com/qarmin/czkawka/pull/1152) +- Big speedup when searching for empty folders(especially with multithreading + cached FS schema) - [#1152](https://github.com/qarmin/czkawka/pull/1152) - Collecting files for scan can be a lot of faster due lazy file metadata gathering - [#1152](https://github.com/qarmin/czkawka/pull/1152) - Fixed recognizing not accessible folders as non-empty - [#1152](https://github.com/qarmin/czkawka/pull/1152) diff --git a/README.md b/README.md index b740b9a..783465b 100644 --- a/README.md +++ b/README.md @@ -2,6 +2,8 @@ **Czkawka** (_tch•kav•ka_ (IPA: [ˈʧ̑kafka]), "hiccup" in Polish) is a simple, fast and free app to remove unnecessary files from your computer. +**Krokiet** ((IPA: [ˈkrɔcɛt]), "croquet" in Polish) same as above, but uses Slint frontend. 
+ ## Features - Written in memory-safe Rust - Amazingly fast - due to using more or less advanced algorithms and multithreading @@ -9,7 +11,7 @@ - Multiplatform - works on Linux, Windows, macOS, FreeBSD and many more - Cache support - second and further scans should be much faster than the first one - CLI frontend - for easy automation -- GUI frontend - uses GTK 4 framework and looks similar to FSlint +- GUI frontend - uses GTK 4 or Slint frameworks - No spying - Czkawka does not have access to the Internet, nor does it collect any user information or statistics - Multilingual - support multiple languages like Polish, English or Italian - Multiple tools to use: @@ -36,9 +38,18 @@ Each tool uses different technologies, so you can find instructions for each of ## Benchmarks -Since Czkawka is written in Rust and it aims to be a faster alternative to FSlint or DupeGuru which are written in Python, we need to compare the speed of these tools. +Previous benchmark was done mostly with two python project - dupeguru and fslint. +Both were written in python so it was mostly obvious that Czkawka will be faster due using more low-level functions and faster language. -I tested it on a 256 GB SSD and an i7-4770 CPU. +I tried to use rmlint gui but it not even started on my computer, so instead I used Detwinner, fclones-gui and dupeguru. + +I tested it on a 1024 GB SSD(Sata 3) and an i7-4770 CPU(4/8HT), disk contains 1742102 files which took 850 GB +Minimum file size 64KB, with search in hidden folders without any excluded folders/files. + +Czkawka 7.0.0 +Detwinner 0.4.2 +Dupeguru 4.3.1 +Fclones-gui 0.2.0 I prepared a disk and performed a test without any folder exceptions and with disabled ignoring of hard links. The disk contained 363 215 files, took 221,8 GB and had 62093 duplicate files in 31790 groups which occupied 4,1 GB. 
@@ -83,38 +94,40 @@ Similar images which check 349 image files that occupied 1.7 GB | DupeGuru 4.1.1 (First Run) | 55s | | DupeGuru 4.1.1 (Second Run) | 1s | +Of course there are multiple tools that offer even better performance, but usually are only specialized in one simple area. + ## Comparison to other tools Bleachbit is a master at finding and removing temporary files, while Czkawka only finds the most basic ones. So these two apps shouldn't be compared directly or be considered as an alternative to one another. In this comparison remember, that even if app have same features they may work different(e.g. one app may have more options to choose than other). -| | Czkawka | Krokiet | FSlint | DupeGuru | Bleachbit | -|:------------------------:|:-----------:|:-----------:|:------:|:------------------:|:-----------:| -| Language | Rust | Rust | Python | Python/Obj-C | Python | -| Framework base language | C | Rust | C | C/C++/Obj-C/Swift | C | -| Framework | GTK 4 | Slint | PyGTK2 | Qt 5 (PyQt)/Cocoa | PyGTK3 | -| OS | Lin,Mac,Win | Lin,Mac,Win | Lin | Lin,Mac,Win | Lin,Mac,Win | -| Duplicate finder | ✔ | ✔ | ✔ | ✔ | | -| Empty files | ✔ | ✔ | ✔ | | | -| Empty folders | ✔ | ✔ | ✔ | | | -| Temporary files | ✔ | ✔ | ✔ | | ✔ | -| Big files | ✔ | ✔ | | | | -| Similar images | ✔ | ✔ | | ✔ | | -| Similar videos | ✔ | ✔ | | | | -| Music duplicates(tags) | ✔ | ✔ | | ✔ | | -| Invalid symlinks | ✔ | ✔ | ✔ | | | -| Broken files | ✔ | ✔ | | | | -| Names conflict | ✔ | ✔ | ✔ | | | -| Invalid names/extensions | ✔ | ✔ | ✔ | | | -| Installed packages | | | ✔ | | | -| Bad ID | | | ✔ | | | -| Non stripped binaries | | | ✔ | | | -| Redundant whitespace | | | ✔ | | | -| Overwriting files | | | ✔ | | ✔ | -| Multiple languages | ✔ | | ✔ | ✔ | ✔ | -| Cache support | ✔ | ✔ | | ✔ | | -| In active development | Yes | | No | Yes | Yes | +| | Czkawka | Krokiet | FSlint | DupeGuru | Bleachbit | +|:------------------------:|:-----------:|:-----------:|:------:|:-----------------:|:-----------:| 
+| Language | Rust | Rust | Python | Python/Obj-C | Python | +| Framework base language | C | Rust | C | C/C++/Obj-C/Swift | C | +| Framework | GTK 4 | Slint | PyGTK2 | Qt 5 (PyQt)/Cocoa | PyGTK3 | +| OS | Lin,Mac,Win | Lin,Mac,Win | Lin | Lin,Mac,Win | Lin,Mac,Win | +| Duplicate finder | ✔ | ✔ | ✔ | ✔ | | +| Empty files | ✔ | ✔ | ✔ | | | +| Empty folders | ✔ | ✔ | ✔ | | | +| Temporary files | ✔ | ✔ | ✔ | | ✔ | +| Big files | ✔ | ✔ | | | | +| Similar images | ✔ | ✔ | | ✔ | | +| Similar videos | ✔ | ✔ | | | | +| Music duplicates(tags) | ✔ | ✔ | | ✔ | | +| Invalid symlinks | ✔ | ✔ | ✔ | | | +| Broken files | ✔ | ✔ | | | | +| Names conflict | ✔ | ✔ | ✔ | | | +| Invalid names/extensions | ✔ | ✔ | ✔ | | | +| Installed packages | | | ✔ | | | +| Bad ID | | | ✔ | | | +| Non stripped binaries | | | ✔ | | | +| Redundant whitespace | | | ✔ | | | +| Overwriting files | | | ✔ | | ✔ | +| Multiple languages | ✔ | | ✔ | ✔ | ✔ | +| Cache support | ✔ | ✔ | | ✔ | | +| In active development | Yes | Yes | No | Yes | Yes | ## Other apps There are many similar applications to Czkawka on the Internet, which do some things better and some things worse: @@ -123,6 +136,7 @@ There are many similar applications to Czkawka on the Internet, which do some th - [FSlint](https://github.com/pixelb/fslint) - A little outdated, but still have some tools not available in Czkawka - [AntiDupl.NET](https://github.com/ermig1979/AntiDupl) - Shows a lot of metadata of compared images - [Video Duplicate Finder](https://github.com/0x90d/videoduplicatefinder) - Finds similar videos(surprising, isn't it), supports video thumbnails + ### CLI Due to limited time, the biggest emphasis is on the GUI version so if you are looking for really good and feature-packed console apps, then take a look at these: - [Fclones](https://github.com/pkolaczk/fclones) - One of the fastest tools to find duplicates; it is written also in Rust diff --git a/ci_tester/src/main.rs b/ci_tester/src/main.rs index dc05e66..44a3c11 100644 --- 
a/ci_tester/src/main.rs +++ b/ci_tester/src/main.rs @@ -319,12 +319,12 @@ fn collect_all_files_and_dirs(dir: &str) -> std::io::Result { let path = entry.path(); if path.is_dir() { - folders.insert(path.display().to_string()); - folders_to_check.push(path.display().to_string()); + folders.insert(path.to_string_lossy().to_string()); + folders_to_check.push(path.to_string_lossy().to_string()); } else if path.is_symlink() { - symlinks.insert(path.display().to_string()); + symlinks.insert(path.to_string_lossy().to_string()); } else if path.is_file() { - files.insert(path.display().to_string()); + files.insert(path.to_string_lossy().to_string()); } else { panic!("Unknown type of file {:?}", path); } diff --git a/czkawka_cli/src/commands.rs b/czkawka_cli/src/commands.rs index a884122..8ca8caa 100644 --- a/czkawka_cli/src/commands.rs +++ b/czkawka_cli/src/commands.rs @@ -675,7 +675,7 @@ pub struct FileToSave { #[derive(Debug, clap::Args)] pub struct JsonCompactFileToSave { - #[clap(short, long, value_name = "json-file-name", help = "Saves the results into the compact json file")] + #[clap(short = 'C', long, value_name = "json-file-name", help = "Saves the results into the compact json file")] pub compact_file_to_save: Option, } diff --git a/czkawka_core/src/bad_extensions.rs b/czkawka_core/src/bad_extensions.rs index 22a9e64..5d8bc11 100644 --- a/czkawka_core/src/bad_extensions.rs +++ b/czkawka_core/src/bad_extensions.rs @@ -426,7 +426,7 @@ impl PrintResults for BadExtensions { writeln!(writer, "Found {} files with invalid extension.\n", self.information.number_of_files_with_bad_extension)?; for file_entry in &self.bad_extensions_files { - writeln!(writer, "{} ----- {}", file_entry.path.display(), file_entry.proper_extensions)?; + writeln!(writer, "{:?} ----- {}", file_entry.path, file_entry.proper_extensions)?; } Ok(()) diff --git a/czkawka_core/src/big_file.rs b/czkawka_core/src/big_file.rs index 4e406ce..6ccf27a 100644 --- a/czkawka_core/src/big_file.rs +++ 
b/czkawka_core/src/big_file.rs @@ -13,8 +13,8 @@ use log::debug; use rayon::prelude::*; use serde::{Deserialize, Serialize}; -use crate::common::{check_folder_children, check_if_stop_received, prepare_thread_handler_common, send_info_and_wait_for_ending_all_threads, split_path}; -use crate::common_dir_traversal::{common_read_dir, get_lowercase_name, get_modified_time, CheckingMethod, ProgressData, ToolType}; +use crate::common::{check_folder_children, check_if_stop_received, prepare_thread_handler_common, send_info_and_wait_for_ending_all_threads, split_path_compare}; +use crate::common_dir_traversal::{common_read_dir, get_modified_time, CheckingMethod, ProgressData, ToolType}; use crate::common_tool::{CommonData, CommonToolData, DeleteMethod}; use crate::common_traits::{DebugPrint, PrintResults}; @@ -68,13 +68,9 @@ impl BigFile { #[fun_time(message = "look_for_big_files", level = "debug")] fn look_for_big_files(&mut self, stop_receiver: Option<&Receiver<()>>, progress_sender: Option<&Sender>) -> bool { - let mut folders_to_check: Vec = Vec::with_capacity(1024 * 2); let mut old_map: BTreeMap> = Default::default(); - // Add root folders for finding - for id in &self.common_data.directories.included_directories { - folders_to_check.push(id.clone()); - } + let mut folders_to_check: Vec = self.common_data.directories.included_directories.clone(); let (progress_thread_handle, progress_thread_run, atomic_counter, _check_was_stopped) = prepare_thread_handler_common(progress_sender, 0, 0, 0, CheckingMethod::None, self.common_data.tool_type); @@ -87,13 +83,13 @@ impl BigFile { } let segments: Vec<_> = folders_to_check - .par_iter() + .into_par_iter() .map(|current_folder| { let mut dir_result = vec![]; let mut warnings = vec![]; let mut fe_result = vec![]; - let Some(read_dir) = common_read_dir(current_folder, &mut warnings) else { + let Some(read_dir) = common_read_dir(¤t_folder, &mut warnings) else { return (dir_result, warnings, fe_result); }; @@ -110,22 +106,22 @@ impl 
BigFile { check_folder_children( &mut dir_result, &mut warnings, - current_folder, + ¤t_folder, &entry_data, self.common_data.recursive_search, &self.common_data.directories, &self.common_data.excluded_items, ); } else if file_type.is_file() { - self.collect_file_entry(&atomic_counter, &entry_data, &mut fe_result, &mut warnings, current_folder); + self.collect_file_entry(&atomic_counter, &entry_data, &mut fe_result, &mut warnings, ¤t_folder); } } (dir_result, warnings, fe_result) }) .collect(); - // Advance the frontier - folders_to_check.clear(); + let required_size = segments.iter().map(|(segment, _, _)| segment.len()).sum::(); + folders_to_check = Vec::with_capacity(required_size); // Process collected data for (segment, warnings, fe_result) in segments { @@ -155,12 +151,7 @@ impl BigFile { current_folder: &Path, ) { atomic_counter.fetch_add(1, Ordering::Relaxed); - - let Some(file_name_lowercase) = get_lowercase_name(entry_data, warnings) else { - return; - }; - - if !self.common_data.allowed_extensions.matches_filename(&file_name_lowercase) { + if !self.common_data.allowed_extensions.check_if_entry_ends_with_extension(entry_data) { return; } @@ -178,9 +169,9 @@ impl BigFile { } let fe: FileEntry = FileEntry { - path: current_file_name.clone(), - size: metadata.len(), modified_date: get_modified_time(&metadata, warnings, ¤t_file_name, false), + path: current_file_name, + size: metadata.len(), }; fe_result.push((fe.size, fe)); @@ -198,10 +189,7 @@ impl BigFile { for (_size, mut vector) in iter { if self.information.number_of_real_files < self.number_of_files_to_check { if vector.len() > 1 { - vector.sort_unstable_by_key(|e| { - let t = split_path(e.path.as_path()); - (t.0, t.1) - }); + vector.sort_unstable_by(|a, b| split_path_compare(a.path.as_path(), b.path.as_path())); } for file in vector { if self.information.number_of_real_files < self.number_of_files_to_check { @@ -222,7 +210,7 @@ impl BigFile { DeleteMethod::Delete => { for file_entry in &self.big_files 
{ if fs::remove_file(&file_entry.path).is_err() { - self.common_data.text_messages.warnings.push(file_entry.path.display().to_string()); + self.common_data.text_messages.warnings.push(file_entry.path.to_string_lossy().to_string()); } } } @@ -271,7 +259,7 @@ impl PrintResults for BigFile { writeln!(writer, "{} the smallest files.\n\n", self.information.number_of_real_files)?; } for file_entry in &self.big_files { - writeln!(writer, "{} ({}) - {}", format_size(file_entry.size, BINARY), file_entry.size, file_entry.path.display())?; + writeln!(writer, "{} ({}) - {:?}", format_size(file_entry.size, BINARY), file_entry.size, file_entry.path)?; } } else { write!(writer, "Not found any files.").unwrap(); diff --git a/czkawka_core/src/broken_files.rs b/czkawka_core/src/broken_files.rs index ab57126..8ad76af 100644 --- a/czkawka_core/src/broken_files.rs +++ b/czkawka_core/src/broken_files.rs @@ -22,7 +22,7 @@ use crate::common::{ IMAGE_RS_BROKEN_FILES_EXTENSIONS, PDF_FILES_EXTENSIONS, ZIP_FILES_EXTENSIONS, }; use crate::common_cache::{get_broken_files_cache_file, load_cache_from_file_generalized_by_path, save_cache_to_file_generalized}; -use crate::common_dir_traversal::{common_read_dir, get_lowercase_name, get_modified_time, CheckingMethod, ProgressData, ToolType}; +use crate::common_dir_traversal::{common_read_dir, get_modified_time, CheckingMethod, ProgressData, ToolType}; use crate::common_tool::{CommonData, CommonToolData, DeleteMethod}; use crate::common_traits::*; @@ -108,12 +108,7 @@ impl BrokenFiles { #[fun_time(message = "check_files", level = "debug")] fn check_files(&mut self, stop_receiver: Option<&Receiver<()>>, progress_sender: Option<&Sender>) -> bool { - let mut folders_to_check: Vec = Vec::with_capacity(1024 * 2); - - // Add root folders for finding - for id in &self.common_data.directories.included_directories { - folders_to_check.push(id.clone()); - } + let mut folders_to_check: Vec = self.common_data.directories.included_directories.clone(); let 
(progress_thread_handle, progress_thread_run, atomic_counter, _check_was_stopped) = prepare_thread_handler_common(progress_sender, 0, 1, 0, CheckingMethod::None, self.common_data.tool_type); @@ -126,13 +121,13 @@ impl BrokenFiles { } let segments: Vec<_> = folders_to_check - .par_iter() + .into_par_iter() .map(|current_folder| { let mut dir_result = vec![]; let mut warnings = vec![]; let mut fe_result = vec![]; - let Some(read_dir) = common_read_dir(current_folder, &mut warnings) else { + let Some(read_dir) = common_read_dir(¤t_folder, &mut warnings) else { return (dir_result, warnings, fe_result); }; @@ -149,14 +144,14 @@ impl BrokenFiles { check_folder_children( &mut dir_result, &mut warnings, - current_folder, + ¤t_folder, &entry_data, self.common_data.recursive_search, &self.common_data.directories, &self.common_data.excluded_items, ); } else if file_type.is_file() { - if let Some(file_entry) = self.get_file_entry(&atomic_counter, &entry_data, &mut warnings, current_folder) { + if let Some(file_entry) = self.get_file_entry(&atomic_counter, &entry_data, &mut warnings, ¤t_folder) { fe_result.push((file_entry.path.to_string_lossy().to_string(), file_entry)); } } @@ -166,8 +161,8 @@ impl BrokenFiles { .collect(); debug!("check_files - collected files"); - // Advance the frontier - folders_to_check.clear(); + let required_size = segments.iter().map(|(segment, _, _)| segment.len()).sum::(); + folders_to_check = Vec::with_capacity(required_size); // Process collected data for (segment, warnings, fe_result) in segments { @@ -185,13 +180,11 @@ impl BrokenFiles { fn get_file_entry(&self, atomic_counter: &Arc, entry_data: &DirEntry, warnings: &mut Vec, current_folder: &Path) -> Option { atomic_counter.fetch_add(1, Ordering::Relaxed); - - let file_name_lowercase = get_lowercase_name(entry_data, warnings)?; - - if !self.common_data.allowed_extensions.matches_filename(&file_name_lowercase) { + if 
!self.common_data.allowed_extensions.check_if_entry_ends_with_extension(entry_data) { return None; } + let file_name_lowercase = entry_data.file_name().to_string_lossy().to_lowercase(); let type_of_file = check_extension_availability(&file_name_lowercase); if !check_if_file_extension_is_allowed(&type_of_file, &self.checked_types) { @@ -208,8 +201,8 @@ impl BrokenFiles { }; let fe: FileEntry = FileEntry { - path: current_file_name.clone(), modified_date: get_modified_time(&metadata, warnings, ¤t_file_name, false), + path: current_file_name, size: metadata.len(), type_of_file, error_string: String::new(), @@ -331,7 +324,7 @@ impl BrokenFiles { for (name, file_entry) in files_to_check { if let Some(cached_file_entry) = loaded_hash_map.get(&name) { - records_already_cached.insert(name.clone(), cached_file_entry.clone()); + records_already_cached.insert(name, cached_file_entry.clone()); } else { non_cached_files_to_check.insert(name, file_entry); } @@ -417,7 +410,7 @@ impl BrokenFiles { DeleteMethod::Delete => { for file_entry in &self.broken_files { if fs::remove_file(&file_entry.path).is_err() { - self.common_data.text_messages.warnings.push(file_entry.path.display().to_string()); + self.common_data.text_messages.warnings.push(file_entry.path.to_string_lossy().to_string()); } } } @@ -472,7 +465,7 @@ impl PrintResults for BrokenFiles { if !self.broken_files.is_empty() { writeln!(writer, "Found {} broken files.", self.information.number_of_broken_files)?; for file_entry in &self.broken_files { - writeln!(writer, "{} - {}", file_entry.path.display(), file_entry.error_string)?; + writeln!(writer, "{:?} - {}", file_entry.path, file_entry.error_string)?; } } else { write!(writer, "Not found any broken files.")?; diff --git a/czkawka_core/src/common.rs b/czkawka_core/src/common.rs index 7ebf3b0..bf45b94 100644 --- a/czkawka_core/src/common.rs +++ b/czkawka_core/src/common.rs @@ -1,10 +1,11 @@ #![allow(unused_imports)] // I don't wanna fight with unused imports in this file, 
so simply ignore it to avoid too much complexity +use std::cmp::Ordering; use std::ffi::OsString; use std::fs::{DirEntry, File, OpenOptions}; use std::path::{Path, PathBuf}; -use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering}; -use std::sync::Arc; +use std::sync::atomic::{AtomicBool, AtomicUsize}; +use std::sync::{atomic, Arc}; use std::thread::{sleep, JoinHandle}; use std::time::{Duration, Instant, SystemTime}; use std::{fs, thread}; @@ -202,18 +203,18 @@ pub fn open_cache_folder(cache_file_name: &str, save_to_cache: bool, use_json: b if save_to_cache { if cache_dir.exists() { if !cache_dir.is_dir() { - warnings.push(format!("Config dir {} is a file!", cache_dir.display())); + warnings.push(format!("Config dir {cache_dir:?} is a file!")); return None; } } else if let Err(e) = fs::create_dir_all(&cache_dir) { - warnings.push(format!("Cannot create config dir {}, reason {}", cache_dir.display(), e)); + warnings.push(format!("Cannot create config dir {cache_dir:?}, reason {e}")); return None; } file_handler_default = Some(match OpenOptions::new().truncate(true).write(true).create(true).open(&cache_file) { Ok(t) => t, Err(e) => { - warnings.push(format!("Cannot create or open cache file {}, reason {}", cache_file.display(), e)); + warnings.push(format!("Cannot create or open cache file {cache_file:?}, reason {e}")); return None; } }); @@ -221,7 +222,7 @@ pub fn open_cache_folder(cache_file_name: &str, save_to_cache: bool, use_json: b file_handler_json = Some(match OpenOptions::new().truncate(true).write(true).create(true).open(&cache_file_json) { Ok(t) => t, Err(e) => { - warnings.push(format!("Cannot create or open cache file {}, reason {}", cache_file_json.display(), e)); + warnings.push(format!("Cannot create or open cache file {cache_file_json:?}, reason {e}")); return None; } }); @@ -233,7 +234,7 @@ pub fn open_cache_folder(cache_file_name: &str, save_to_cache: bool, use_json: b if use_json { file_handler_json = 
Some(OpenOptions::new().read(true).open(&cache_file_json).ok()?); } else { - // messages.push(format!("Cannot find or open cache file {}", cache_file.display())); // No error or warning + // messages.push(format!("Cannot find or open cache file {cache_file:?}")); // No error or warning return None; } } @@ -321,12 +322,31 @@ pub fn get_dynamic_image_from_raw_image(path: impl AsRef + std::fmt::Debug pub fn split_path(path: &Path) -> (String, String) { match (path.parent(), path.file_name()) { - (Some(dir), Some(file)) => (dir.display().to_string(), file.to_string_lossy().into_owned()), - (Some(dir), None) => (dir.display().to_string(), String::new()), + (Some(dir), Some(file)) => (dir.to_string_lossy().to_string(), file.to_string_lossy().into_owned()), + (Some(dir), None) => (dir.to_string_lossy().to_string(), String::new()), (None, _) => (String::new(), String::new()), } } +pub fn split_path_compare(path_a: &Path, path_b: &Path) -> Ordering { + let parent_dir_a = path_a.parent(); + let parent_dir_b = path_b.parent(); + if parent_dir_a.is_none() || parent_dir_b.is_none() { + let file_name_a = path_a.file_name(); + let file_name_b = path_b.file_name(); + if file_name_a.is_none() || file_name_b.is_none() { + return Ordering::Equal; + } + + return if file_name_a > file_name_b { Ordering::Greater } else { Ordering::Less }; + } + if parent_dir_a > parent_dir_b { + Ordering::Greater + } else { + Ordering::Less + } +} + pub fn create_crash_message(library_name: &str, file_path: &str, home_library_url: &str) -> String { format!("{library_name} library crashed when opening \"{file_path}\", please check if this is fixed with the latest version of {library_name} (e.g. 
with https://github.com/qarmin/crates_tester) and if it is not fixed, please report bug here - {home_library_url}") } @@ -569,14 +589,14 @@ pub fn prepare_thread_handler_common( checking_method, current_stage, max_stage, - entries_checked: atomic_counter.load(Ordering::Relaxed), + entries_checked: atomic_counter.load(atomic::Ordering::Relaxed), entries_to_check: max_value, tool_type, }) .unwrap(); time_since_last_send = SystemTime::now(); } - if !progress_thread_run.load(Ordering::Relaxed) { + if !progress_thread_run.load(atomic::Ordering::Relaxed) { break; } sleep(Duration::from_millis(LOOP_DURATION as u64)); @@ -600,7 +620,7 @@ pub fn check_if_stop_received(stop_receiver: Option<&crossbeam_channel::Receiver #[fun_time(message = "send_info_and_wait_for_ending_all_threads", level = "debug")] pub fn send_info_and_wait_for_ending_all_threads(progress_thread_run: &Arc, progress_thread_handle: JoinHandle<()>) { - progress_thread_run.store(false, Ordering::Relaxed); + progress_thread_run.store(false, atomic::Ordering::Relaxed); progress_thread_handle.join().unwrap(); } diff --git a/czkawka_core/src/common_cache.rs b/czkawka_core/src/common_cache.rs index a5f73fb..ac29129 100644 --- a/czkawka_core/src/common_cache.rs +++ b/czkawka_core/src/common_cache.rs @@ -55,25 +55,21 @@ where { let writer = BufWriter::new(file_handler.unwrap()); // Unwrap because cannot fail here if let Err(e) = bincode::serialize_into(writer, &hashmap_to_save) { - text_messages - .warnings - .push(format!("Cannot write data to cache file {}, reason {}", cache_file.display(), e)); - debug!("Failed to save cache to file {:?}", cache_file); + text_messages.warnings.push(format!("Cannot write data to cache file {cache_file:?}, reason {e}")); + debug!("Failed to save cache to file {cache_file:?}"); return text_messages; } - debug!("Saved binary to file {:?}", cache_file); + debug!("Saved binary to file {cache_file:?}"); } if save_also_as_json { if let Some(file_handler_json) = file_handler_json { let 
writer = BufWriter::new(file_handler_json); if let Err(e) = serde_json::to_writer(writer, &hashmap_to_save) { - text_messages - .warnings - .push(format!("Cannot write data to cache file {}, reason {}", cache_file_json.display(), e)); - debug!("Failed to save cache to file {:?}", cache_file_json); + text_messages.warnings.push(format!("Cannot write data to cache file {cache_file_json:?}, reason {e}")); + debug!("Failed to save cache to file {cache_file_json:?}"); return text_messages; } - debug!("Saved json to file {:?}", cache_file_json); + debug!("Saved json to file {cache_file_json:?}"); } } @@ -182,10 +178,8 @@ where vec_loaded_entries = match bincode::deserialize_from(reader) { Ok(t) => t, Err(e) => { - text_messages - .warnings - .push(format!("Failed to load data from cache file {}, reason {}", cache_file.display(), e)); - debug!("Failed to load cache from file {:?}", cache_file); + text_messages.warnings.push(format!("Failed to load data from cache file {cache_file:?}, reason {e}")); + debug!("Failed to load cache from file {cache_file:?}"); return (text_messages, None); } }; @@ -194,10 +188,8 @@ where vec_loaded_entries = match serde_json::from_reader(reader) { Ok(t) => t, Err(e) => { - text_messages - .warnings - .push(format!("Failed to load data from cache file {}, reason {}", cache_file_json.display(), e)); - debug!("Failed to load cache from file {:?}", cache_file); + text_messages.warnings.push(format!("Failed to load data from cache file {cache_file_json:?}, reason {e}")); + debug!("Failed to load cache from file {cache_file:?}"); return (text_messages, None); } }; diff --git a/czkawka_core/src/common_dir_traversal.rs b/czkawka_core/src/common_dir_traversal.rs index e877931..71d6046 100644 --- a/czkawka_core/src/common_dir_traversal.rs +++ b/czkawka_core/src/common_dir_traversal.rs @@ -108,7 +108,8 @@ pub(crate) enum FolderEmptiness { /// Struct assigned to each checked folder with parent path(used to ignore parent if children are not empty) and 
flag which shows if folder is empty #[derive(Clone, Debug)] pub struct FolderEntry { - pub(crate) parent_path: Option, + pub path: PathBuf, + pub(crate) parent_path: Option, // Usable only when finding pub(crate) is_empty: FolderEmptiness, pub modified_date: u64, @@ -316,7 +317,7 @@ pub enum DirTraversalResult { }, SuccessFolders { warnings: Vec, - folder_entries: BTreeMap, // Path, FolderEntry + folder_entries: HashMap, // Path, FolderEntry }, Stopped, } @@ -344,15 +345,15 @@ where let mut all_warnings = vec![]; let mut grouped_file_entries: BTreeMap> = BTreeMap::new(); - let mut folder_entries: HashMap = HashMap::new(); + let mut folder_entries: HashMap = HashMap::new(); // Add root folders into result (only for empty folder collection) - let mut folders_to_check: Vec = Vec::with_capacity(1024 * 2); if self.collect == Collect::EmptyFolders { for dir in &self.root_dirs { folder_entries.insert( - dir.clone(), + dir.to_string_lossy().to_string(), FolderEntry { + path: dir.clone(), parent_path: None, is_empty: FolderEmptiness::Maybe, modified_date: 0, @@ -361,7 +362,7 @@ where } } // Add root folders for finding - folders_to_check.extend(self.root_dirs); + let mut folders_to_check: Vec = self.root_dirs.clone(); let (progress_thread_handle, progress_thread_run, atomic_counter, _check_was_stopped) = prepare_thread_handler_common(self.progress_sender, 0, self.max_stage, 0, self.checking_method, self.tool_type); @@ -385,7 +386,7 @@ where } let segments: Vec<_> = folders_to_check - .par_iter() + .into_par_iter() .map(|current_folder| { let mut dir_result = vec![]; let mut warnings = vec![]; @@ -393,27 +394,29 @@ where let mut set_as_not_empty_folder_list = vec![]; let mut folder_entries_list = vec![]; - let Some(read_dir) = common_read_dir(current_folder, &mut warnings) else { - set_as_not_empty_folder_list.push(current_folder.clone()); + let Some(read_dir) = common_read_dir(¤t_folder, &mut warnings) else { + if collect == Collect::EmptyFolders { + 
set_as_not_empty_folder_list.push(current_folder); + } return (dir_result, warnings, fe_result, set_as_not_empty_folder_list, folder_entries_list); }; let mut counter = 0; // Check every sub folder/file/link etc. 'dir: for entry in read_dir { - let Some(entry_data) = common_get_entry_data(&entry, &mut warnings, current_folder) else { + let Some(entry_data) = common_get_entry_data(&entry, &mut warnings, ¤t_folder) else { continue; }; let Ok(file_type) = entry_data.file_type() else { continue }; match (entry_type(file_type), collect) { (EntryType::Dir, Collect::Files | Collect::InvalidSymlinks) => { - process_dir_in_file_symlink_mode(recursive_search, current_folder, entry_data, &directories, &mut dir_result, &mut warnings, &excluded_items); + process_dir_in_file_symlink_mode(recursive_search, ¤t_folder, entry_data, &directories, &mut dir_result, &mut warnings, &excluded_items); } (EntryType::Dir, Collect::EmptyFolders) => { counter += 1; process_dir_in_dir_mode( - current_folder, + ¤t_folder, entry_data, &directories, &mut dir_result, @@ -430,7 +433,7 @@ where &mut warnings, &mut fe_result, &allowed_extensions, - current_folder, + ¤t_folder, &directories, &excluded_items, minimal_file_size, @@ -440,7 +443,7 @@ where (EntryType::File | EntryType::Symlink, Collect::EmptyFolders) => { #[cfg(target_family = "unix")] if directories.exclude_other_filesystems() { - match directories.is_on_other_filesystems(current_folder) { + match directories.is_on_other_filesystems(¤t_folder) { Ok(true) => continue 'dir, Err(e) => warnings.push(e.to_string()), _ => (), @@ -459,7 +462,7 @@ where &mut warnings, &mut fe_result, &allowed_extensions, - current_folder, + ¤t_folder, &directories, &excluded_items, ); @@ -477,8 +480,8 @@ where }) .collect(); - // Advance the frontier - folders_to_check.clear(); + let required_size = segments.iter().map(|(segment, _, _, _, _)| segment.len()).sum::(); + folders_to_check = Vec::with_capacity(required_size); // Process collected data for (segment, 
warnings, fe_result, set_as_not_empty_folder_list, fe_list) in segments { @@ -492,7 +495,7 @@ where set_as_not_empty_folder(&mut folder_entries, current_folder); } for (path, entry) in fe_list { - folder_entries.insert(path, entry); + folder_entries.insert(path.to_string_lossy().to_string(), entry); } } } @@ -511,7 +514,7 @@ where warnings: all_warnings, }, Collect::EmptyFolders => DirTraversalResult::SuccessFolders { - folder_entries: folder_entries.into_iter().collect(), + folder_entries, warnings: all_warnings, }, } @@ -529,11 +532,7 @@ fn process_file_in_file_mode( minimal_file_size: u64, maximal_file_size: u64, ) { - let Some(file_name_lowercase) = get_lowercase_name(entry_data, warnings) else { - return; - }; - - if !allowed_extensions.matches_filename(&file_name_lowercase) { + if !allowed_extensions.check_if_entry_ends_with_extension(entry_data) { return; } @@ -558,9 +557,9 @@ fn process_file_in_file_mode( if (minimal_file_size..=maximal_file_size).contains(&metadata.len()) { // Creating new file entry let fe: FileEntry = FileEntry { - path: current_file_name.clone(), size: metadata.len(), modified_date: get_modified_time(&metadata, warnings, ¤t_file_name, false), + path: current_file_name, hash: String::new(), symlink_info: None, }; @@ -601,9 +600,10 @@ fn process_dir_in_dir_mode( dir_result.push(next_folder.clone()); folder_entries_list.push(( - next_folder, + next_folder.clone(), FolderEntry { - parent_path: Some(current_folder.to_path_buf()), + path: next_folder, + parent_path: Some(current_folder.to_string_lossy().to_string()), is_empty: FolderEmptiness::Maybe, modified_date: get_modified_time(&metadata, warnings, current_folder, true), }, @@ -653,11 +653,7 @@ fn process_symlink_in_symlink_mode( directories: &Directories, excluded_items: &ExcludedItems, ) { - let Some(file_name_lowercase) = get_lowercase_name(entry_data, warnings) else { - return; - }; - - if !allowed_extensions.matches_filename(&file_name_lowercase) { + if 
!allowed_extensions.check_if_entry_ends_with_extension(entry_data) { return; } @@ -716,8 +712,8 @@ fn process_symlink_in_symlink_mode( // Creating new file entry let fe: FileEntry = FileEntry { - path: current_file_name.clone(), modified_date: get_modified_time(&metadata, warnings, &current_file_name, false), + path: current_file_name, size: 0, hash: String::new(), symlink_info: Some(SymlinkInfo { destination_path, type_of_error }), @@ -733,7 +729,7 @@ pub fn common_read_dir(current_folder: &Path, warnings: &mut Vec<String>) -> Opt Err(e) => { warnings.push(flc!( "core_cannot_open_dir", - generate_translation_hashmap(vec![("dir", current_folder.display().to_string()), ("reason", e.to_string())]) + generate_translation_hashmap(vec![("dir", current_folder.to_string_lossy().to_string()), ("reason", e.to_string())]) )); None } @@ -745,7 +741,7 @@ pub fn common_get_entry_data<'a>(entry: &'a Result<DirEntry, std::io::Error>, wa Err(e) => { warnings.push(flc!( "core_cannot_read_entry_dir", - generate_translation_hashmap(vec![("dir", current_folder.display().to_string()), ("reason", e.to_string())]) + generate_translation_hashmap(vec![("dir", current_folder.to_string_lossy().to_string()), ("reason", e.to_string())]) )); return None; } @@ -758,7 +754,7 @@ pub fn common_get_metadata_dir(entry_data: &DirEntry, warnings: &mut Vec<String> Err(e) => { warnings.push(flc!( "core_cannot_read_metadata_dir", - generate_translation_hashmap(vec![("dir", current_folder.display().to_string()), ("reason", e.to_string())]) + generate_translation_hashmap(vec![("dir", current_folder.to_string_lossy().to_string()), ("reason", e.to_string())]) )); return None; } @@ -772,7 +768,7 @@ pub fn common_get_entry_data_metadata<'a>(entry: &'a Result<DirEntry, std::io::Error> Err(e) => { warnings.push(flc!( "core_cannot_read_entry_dir", - generate_translation_hashmap(vec![("dir", current_folder.display().to_string()), ("reason", e.to_string())]) + generate_translation_hashmap(vec![("dir", current_folder.to_string_lossy().to_string()), ("reason", e.to_string())]) )); return None; } @@ 
-782,7 +778,7 @@ pub fn common_get_entry_data_metadata<'a>(entry: &'a Result<DirEntry, std::io::Error> Err(e) => { warnings.push(flc!( "core_cannot_read_metadata_dir", - generate_translation_hashmap(vec![("dir", current_folder.display().to_string()), ("reason", e.to_string())]) + generate_translation_hashmap(vec![("dir", current_folder.to_string_lossy().to_string()), ("reason", e.to_string())]) )); return None; } @@ -795,7 +791,7 @@ pub fn get_modified_time(metadata: &Metadata, warnings: &mut Vec<String>, curren Ok(t) => match t.duration_since(UNIX_EPOCH) { Ok(d) => d.as_secs(), Err(_inspected) => { - let translation_hashmap = generate_translation_hashmap(vec![("name", current_file_name.display().to_string())]); + let translation_hashmap = generate_translation_hashmap(vec![("name", current_file_name.to_string_lossy().to_string())]); if is_folder { warnings.push(flc!("core_folder_modified_before_epoch", translation_hashmap)); } else { @@ -805,7 +801,7 @@ pub fn get_modified_time(metadata: &Metadata, warnings: &mut Vec<String>, curren } }, Err(e) => { - let translation_hashmap = generate_translation_hashmap(vec![("name", current_file_name.display().to_string()), ("reason", e.to_string())]); + let translation_hashmap = generate_translation_hashmap(vec![("name", current_file_name.to_string_lossy().to_string()), ("reason", e.to_string())]); if is_folder { warnings.push(flc!("core_folder_no_modification_date", translation_hashmap)); } else { @@ -822,7 +818,7 @@ pub fn get_lowercase_name(entry_data: &DirEntry, warnings: &mut Vec<String>) -> Err(_inspected) => { warnings.push(flc!( "core_file_not_utf8_name", - generate_translation_hashmap(vec![("name", entry_data.path().display().to_string())]) + generate_translation_hashmap(vec![("name", entry_data.path().to_string_lossy().to_string())]) )); return None; } @@ -831,8 +827,8 @@ pub fn get_lowercase_name(entry_data: &DirEntry, warnings: &mut Vec<String>) -> Some(name) } -fn set_as_not_empty_folder(folder_entries: &mut HashMap<PathBuf, FolderEntry>, current_folder: &Path) { - let mut d = 
folder_entries.get_mut(current_folder).unwrap(); +fn set_as_not_empty_folder(folder_entries: &mut HashMap<String, FolderEntry>, current_folder: &Path) { + let mut d = folder_entries.get_mut(current_folder.to_string_lossy().as_ref()).unwrap(); // Loop to recursively set as non empty this and all his parent folders loop { d.is_empty = FolderEmptiness::No; diff --git a/czkawka_core/src/common_directory.rs b/czkawka_core/src/common_directory.rs index b650a6f..b5e2a3e 100644 --- a/czkawka_core/src/common_directory.rs +++ b/czkawka_core/src/common_directory.rs @@ -107,7 +107,7 @@ impl Directories { if !is_excluded { messages.warnings.push(flc!( "core_directory_must_exists", - generate_translation_hashmap(vec![("path", directory.display().to_string())]) + generate_translation_hashmap(vec![("path", directory.to_string_lossy().to_string())]) )); } return (None, messages); @@ -116,7 +116,7 @@ impl Directories { if !directory.is_dir() { messages.warnings.push(flc!( "core_directory_must_be_directory", - generate_translation_hashmap(vec![("path", directory.display().to_string())]) + generate_translation_hashmap(vec![("path", directory.to_string_lossy().to_string())]) )); return (None, messages); } @@ -293,7 +293,7 @@ impl Directories { Ok(m) => self.included_dev_ids.push(m.dev()), Err(_) => messages.errors.push(flc!( "core_directory_unable_to_get_device_id", - generate_translation_hashmap(vec![("path", d.display().to_string())]) + generate_translation_hashmap(vec![("path", d.to_string_lossy().to_string())]) )), } } @@ -326,7 +326,7 @@ impl Directories { Ok(m) => Ok(!self.included_dev_ids.iter().any(|&id| id == m.dev())), Err(_) => Err(flc!( "core_directory_unable_to_get_device_id", - generate_translation_hashmap(vec![("path", path.display().to_string())]) + generate_translation_hashmap(vec![("path", path.to_string_lossy().to_string())]) )), } } diff --git a/czkawka_core/src/common_extensions.rs b/czkawka_core/src/common_extensions.rs index 2bfec8c..0c4d9d9 100644 --- 
a/czkawka_core/src/common_extensions.rs +++ b/czkawka_core/src/common_extensions.rs @@ -1,8 +1,10 @@ use crate::common_messages::Messages; +use std::collections::HashSet; +use std::fs::DirEntry; #[derive(Debug, Clone, Default)] pub struct Extensions { - file_extensions: Vec<String>, + file_extensions_hashset: HashSet<String>, } impl Extensions { @@ -28,26 +30,24 @@ impl Extensions { continue; } - if !extension.starts_with('.') { - extension = format!(".{extension}"); + if extension.starts_with('.') { + extension = extension[1..].to_string(); } - if extension[1..].contains('.') { + if extension.contains('.') { messages.warnings.push(format!("{extension} is not valid extension because contains dot inside")); continue; } - if extension[1..].contains(' ') { + if extension.contains(' ') { messages.warnings.push(format!("{extension} is not valid extension because contains empty space inside")); continue; } - if !self.file_extensions.contains(&extension) { - self.file_extensions.push(extension); - } + self.file_extensions_hashset.insert(extension); } - if self.file_extensions.is_empty() { + if self.file_extensions_hashset.is_empty() { messages .messages .push("No valid extensions were provided, so allowing all extensions by default.".to_string()); @@ -57,32 +57,36 @@ impl Extensions { pub fn matches_filename(&self, file_name: &str) -> bool { // assert_eq!(file_name, file_name.to_lowercase()); - if !self.file_extensions.is_empty() && !self.file_extensions.iter().any(|e| file_name.ends_with(e)) { + if !self.file_extensions_hashset.is_empty() && !self.file_extensions_hashset.iter().any(|e| file_name.ends_with(e)) { return false; } true } + pub fn check_if_entry_ends_with_extension(&self, entry_data: &DirEntry) -> bool { + if self.file_extensions_hashset.is_empty() { + return true; + } + + let file_name = entry_data.file_name(); + let Some(file_name_str) = file_name.to_str() else { return false }; + let Some(extension_idx) = file_name_str.rfind('.') else { return false }; + let extension = 
&file_name_str[extension_idx + 1..]; + + if extension.chars().all(|c| c.is_ascii_lowercase()) { + self.file_extensions_hashset.contains(extension) + } else { + self.file_extensions_hashset.contains(&extension.to_lowercase()) + } + } pub fn using_custom_extensions(&self) -> bool { - !self.file_extensions.is_empty() + !self.file_extensions_hashset.is_empty() } pub fn extend_allowed_extensions(&mut self, file_extensions: &[&str]) { for extension in file_extensions { - assert!(extension.starts_with('.')); - self.file_extensions.push((*extension).to_string()); + let extension_without_dot = extension.trim_start_matches('.'); + self.file_extensions_hashset.insert(extension_without_dot.to_string()); } } - - pub fn validate_allowed_extensions(&mut self, file_extensions: &[&str]) { - let mut current_file_extensions = Vec::new(); - - for extension in file_extensions { - assert!(extension.starts_with('.')); - if self.file_extensions.contains(&(*extension).to_string()) { - current_file_extensions.push((*extension).to_string()); - } - } - self.file_extensions = current_file_extensions; - } } diff --git a/czkawka_core/src/duplicate.rs b/czkawka_core/src/duplicate.rs index 9863136..048b3ac 100644 --- a/czkawka_core/src/duplicate.rs +++ b/czkawka_core/src/duplicate.rs @@ -103,7 +103,7 @@ impl DuplicateFinder { ignore_hard_links: true, hash_type: HashType::Blake3, use_prehash_cache: true, minimal_cache_file_size: 1024 * 256, // By default cache only >= 256 KB files minimal_prehash_cache_file_size: 0, case_sensitive_name_comparison: false, } @@ -522,7 +522,7 @@ impl DuplicateFinder { .map(|(size, vec_file_entry)| { let mut hashmap_with_hash: BTreeMap<String, Vec<FileEntry>> = Default::default(); let mut errors: Vec<String> = Vec::new(); - let mut buffer = [0u8; 1024 * 2]; + let mut buffer = [0u8; 1024 * 32]; atomic_counter.fetch_add(vec_file_entry.len(), Ordering::Relaxed); if check_if_stop_received(stop_receiver) { @@ -1002,7 
+1002,7 @@ impl PrintResults for DuplicateFinder { for (name, vector) in self.files_with_identical_names.iter().rev() { writeln!(writer, "Name - {} - {} files ", name, vector.len())?; for j in vector { - writeln!(writer, "{}", j.path.display())?; + writeln!(writer, "{:?}", j.path)?; } writeln!(writer)?; } @@ -1018,9 +1018,9 @@ impl PrintResults for DuplicateFinder { )?; for (name, (file_entry, vector)) in self.files_with_identical_names_referenced.iter().rev() { writeln!(writer, "Name - {} - {} files ", name, vector.len())?; - writeln!(writer, "Reference file - {}", file_entry.path.display())?; + writeln!(writer, "Reference file - {:?}", file_entry.path)?; for j in vector { - writeln!(writer, "{}", j.path.display())?; + writeln!(writer, "{:?}", j.path)?; } writeln!(writer)?; } @@ -1042,7 +1042,7 @@ impl PrintResults for DuplicateFinder { for ((size, name), vector) in self.files_with_identical_size_names.iter().rev() { writeln!(writer, "Name - {}, {} - {} files ", name, format_size(*size, BINARY), vector.len())?; for j in vector { - writeln!(writer, "{}", j.path.display())?; + writeln!(writer, "{:?}", j.path)?; } writeln!(writer)?; } @@ -1058,9 +1058,9 @@ impl PrintResults for DuplicateFinder { )?; for ((size, name), (file_entry, vector)) in self.files_with_identical_size_names_referenced.iter().rev() { writeln!(writer, "Name - {}, {} - {} files ", name, format_size(*size, BINARY), vector.len())?; - writeln!(writer, "Reference file - {}", file_entry.path.display())?; + writeln!(writer, "Reference file - {:?}", file_entry.path)?; for j in vector { - writeln!(writer, "{}", j.path.display())?; + writeln!(writer, "{:?}", j.path)?; } writeln!(writer)?; } @@ -1084,7 +1084,7 @@ impl PrintResults for DuplicateFinder { for (size, vector) in self.files_with_identical_size.iter().rev() { write!(writer, "\n---- Size {} ({}) - {} files \n", format_size(*size, BINARY), size, vector.len())?; for file_entry in vector { - writeln!(writer, "{}", file_entry.path.display())?; + 
writeln!(writer, "{:?}", file_entry.path)?; } } } else if !self.files_with_identical_size_referenced.is_empty() { @@ -1101,9 +1101,9 @@ impl PrintResults for DuplicateFinder { )?; for (size, (file_entry, vector)) in self.files_with_identical_size_referenced.iter().rev() { writeln!(writer, "\n---- Size {} ({}) - {} files", format_size(*size, BINARY), size, vector.len())?; - writeln!(writer, "Reference file - {}", file_entry.path.display())?; + writeln!(writer, "Reference file - {:?}", file_entry.path)?; for file_entry in vector { - writeln!(writer, "{}", file_entry.path.display())?; + writeln!(writer, "{:?}", file_entry.path)?; } } } else { @@ -1127,7 +1127,7 @@ impl PrintResults for DuplicateFinder { for vector in vectors_vector { writeln!(writer, "\n---- Size {} ({}) - {} files", format_size(*size, BINARY), size, vector.len())?; for file_entry in vector { - writeln!(writer, "{}", file_entry.path.display())?; + writeln!(writer, "{:?}", file_entry.path)?; } } } @@ -1146,9 +1146,9 @@ impl PrintResults for DuplicateFinder { for (size, vectors_vector) in self.files_with_identical_hashes_referenced.iter().rev() { for (file_entry, vector) in vectors_vector { writeln!(writer, "\n---- Size {} ({}) - {} files", format_size(*size, BINARY), size, vector.len())?; - writeln!(writer, "Reference file - {}", file_entry.path.display())?; + writeln!(writer, "Reference file - {:?}", file_entry.path)?; for file_entry in vector { - writeln!(writer, "{}", file_entry.path.display())?; + writeln!(writer, "{:?}", file_entry.path)?; } } } @@ -1226,7 +1226,7 @@ pub trait MyHasher { fn hash_calculation(buffer: &mut [u8], file_entry: &FileEntry, hash_type: &HashType, limit: u64) -> Result<String, String> { let mut file_handler = match File::open(&file_entry.path) { Ok(t) => t, - Err(e) => return Err(format!("Unable to check hash of file {}, reason {}", file_entry.path.display(), e)), + Err(e) => return Err(format!("Unable to check hash of file {:?}, reason {e}", file_entry.path)), }; let hasher = &mut 
*hash_type.hasher(); let mut current_file_read_bytes: u64 = 0; @@ -1234,7 +1234,7 @@ fn hash_calculation(buffer: &mut [u8], file_entry: &FileEntry, hash_type: &HashT let n = match file_handler.read(buffer) { Ok(0) => break, Ok(t) => t, - Err(e) => return Err(format!("Error happened when checking hash of file {}, reason {}", file_entry.path.display(), e)), + Err(e) => return Err(format!("Error happened when checking hash of file {:?}, reason {}", file_entry.path, e)), }; current_file_read_bytes += n as u64; diff --git a/czkawka_core/src/empty_files.rs b/czkawka_core/src/empty_files.rs index 68b1d3a..fc9dd44 100644 --- a/czkawka_core/src/empty_files.rs +++ b/czkawka_core/src/empty_files.rs @@ -90,7 +90,7 @@ impl EmptyFiles { DeleteMethod::Delete => { for file_entry in &self.empty_files { if fs::remove_file(file_entry.path.clone()).is_err() { - self.common_data.text_messages.warnings.push(file_entry.path.display().to_string()); + self.common_data.text_messages.warnings.push(file_entry.path.to_string_lossy().to_string()); } } } @@ -135,7 +135,7 @@ impl PrintResults for EmptyFiles { if !self.empty_files.is_empty() { writeln!(writer, "Found {} empty files.", self.information.number_of_empty_files)?; for file_entry in &self.empty_files { - writeln!(writer, "{}", file_entry.path.display())?; + writeln!(writer, "{:?}", file_entry.path)?; } } else { write!(writer, "Not found any empty files.")?; diff --git a/czkawka_core/src/empty_folder.rs b/czkawka_core/src/empty_folder.rs index 2d7a09f..1ac9210 100644 --- a/czkawka_core/src/empty_folder.rs +++ b/czkawka_core/src/empty_folder.rs @@ -1,7 +1,6 @@ -use std::collections::BTreeMap; +use std::collections::HashMap; use std::fs; use std::io::Write; -use std::path::PathBuf; use crossbeam_channel::{Receiver, Sender}; use fun_time::fun_time; @@ -15,7 +14,7 @@ use crate::common_traits::{DebugPrint, PrintResults}; pub struct EmptyFolder { common_data: CommonToolData, information: Info, - empty_folder_list: BTreeMap<PathBuf, FolderEntry>, // Path, 
FolderEntry + empty_folder_list: HashMap<String, FolderEntry>, // Path, FolderEntry } #[derive(Default)] @@ -32,7 +31,7 @@ impl EmptyFolder { } } - pub const fn get_empty_folder_list(&self) -> &BTreeMap<PathBuf, FolderEntry> { + pub const fn get_empty_folder_list(&self) -> &HashMap<String, FolderEntry> { &self.empty_folder_list } @@ -54,7 +53,7 @@ impl EmptyFolder { } fn optimize_folders(&mut self) { - let mut new_directory_folders: BTreeMap<PathBuf, FolderEntry> = Default::default(); + let mut new_directory_folders: HashMap<String, FolderEntry> = Default::default(); for (name, folder_entry) in &self.empty_folder_list { match &folder_entry.parent_path { @@ -151,8 +150,10 @@ impl PrintResults for EmptyFolder { if !self.empty_folder_list.is_empty() { writeln!(writer, "--------------------------Empty folder list--------------------------")?; writeln!(writer, "Found {} empty folders", self.information.number_of_empty_folders)?; - for name in self.empty_folder_list.keys() { - writeln!(writer, "{}", name.display())?; + let mut empty_folder_list = self.empty_folder_list.keys().collect::<Vec<_>>(); + empty_folder_list.sort_unstable(); + for name in empty_folder_list { + writeln!(writer, "{name}")?; } } else { write!(writer, "Not found any empty folders.")?; diff --git a/czkawka_core/src/invalid_symlinks.rs b/czkawka_core/src/invalid_symlinks.rs index 574af56..764e094 100644 --- a/czkawka_core/src/invalid_symlinks.rs +++ b/czkawka_core/src/invalid_symlinks.rs @@ -75,7 +75,7 @@ impl InvalidSymlinks { DeleteMethod::Delete => { for file_entry in &self.invalid_symlinks { if fs::remove_file(file_entry.path.clone()).is_err() { - self.common_data.text_messages.warnings.push(file_entry.path.display().to_string()); + self.common_data.text_messages.warnings.push(file_entry.path.to_string_lossy().to_string()); } } } @@ -112,9 +112,9 @@ impl PrintResults for InvalidSymlinks { for file_entry in &self.invalid_symlinks { writeln!( writer, - "{}\t\t{}\t\t{}", - file_entry.path.display(), - file_entry.symlink_info.clone().expect("invalid traversal result").destination_path.display(), + 
"{:?}\t\t{:?}\t\t{}", + file_entry.path, + file_entry.symlink_info.clone().expect("invalid traversal result").destination_path, match file_entry.symlink_info.clone().expect("invalid traversal result").type_of_error { ErrorType::InfiniteRecursion => "Infinite Recursion", ErrorType::NonExistentFile => "Non Existent File", diff --git a/czkawka_core/src/same_music.rs b/czkawka_core/src/same_music.rs index 89e5fab..4a8b5de 100644 --- a/czkawka_core/src/same_music.rs +++ b/czkawka_core/src/same_music.rs @@ -180,7 +180,7 @@ impl SameMusic { if !self.common_data.allowed_extensions.using_custom_extensions() { self.common_data.allowed_extensions.extend_allowed_extensions(AUDIO_FILES_EXTENSIONS); } else { - self.common_data.allowed_extensions.validate_allowed_extensions(AUDIO_FILES_EXTENSIONS); + self.common_data.allowed_extensions.extend_allowed_extensions(AUDIO_FILES_EXTENSIONS); if !self.common_data.allowed_extensions.using_custom_extensions() { return true; } @@ -242,7 +242,7 @@ impl SameMusic { debug!("load_cache - Starting to check for differences"); for (name, file_entry) in mem::take(&mut self.music_to_check) { if let Some(cached_file_entry) = loaded_hash_map.get(&name) { - records_already_cached.insert(name.clone(), cached_file_entry.clone()); + records_already_cached.insert(name, cached_file_entry.clone()); } else { non_cached_files_to_check.insert(name, file_entry); } @@ -622,7 +622,7 @@ impl SameMusic { music_entries.push(entry.clone()); } used_paths.insert(f_string); - music_entries.push(f_entry.clone()); + music_entries.push(f_entry); duplicated_music_entries.push(music_entries); } } @@ -955,14 +955,8 @@ impl PrintResults for SameMusic { for file_entry in vec_file_entry { writeln!( writer, - "TT: {} - TA: {} - Y: {} - L: {} - G: {} - B: {} - P: {}", - file_entry.track_title, - file_entry.track_artist, - file_entry.year, - file_entry.length, - file_entry.genre, - file_entry.bitrate, - file_entry.path.display() + "TT: {} - TA: {} - Y: {} - L: {} - G: {} - B: {} - 
P: {:?}", + file_entry.track_title, file_entry.track_artist, file_entry.year, file_entry.length, file_entry.genre, file_entry.bitrate, file_entry.path )?; } writeln!(writer)?; @@ -974,26 +968,14 @@ impl PrintResults for SameMusic { writeln!(writer)?; writeln!( writer, - "TT: {} - TA: {} - Y: {} - L: {} - G: {} - B: {} - P: {}", - file_entry.track_title, - file_entry.track_artist, - file_entry.year, - file_entry.length, - file_entry.genre, - file_entry.bitrate, - file_entry.path.display() + "TT: {} - TA: {} - Y: {} - L: {} - G: {} - B: {} - P: {:?}", + file_entry.track_title, file_entry.track_artist, file_entry.year, file_entry.length, file_entry.genre, file_entry.bitrate, file_entry.path )?; for file_entry in vec_file_entry { writeln!( writer, - "TT: {} - TA: {} - Y: {} - L: {} - G: {} - B: {} - P: {}", - file_entry.track_title, - file_entry.track_artist, - file_entry.year, - file_entry.length, - file_entry.genre, - file_entry.bitrate, - file_entry.path.display() + "TT: {} - TA: {} - Y: {} - L: {} - G: {} - B: {} - P: {:?}", + file_entry.track_title, file_entry.track_artist, file_entry.year, file_entry.length, file_entry.genre, file_entry.bitrate, file_entry.path )?; } writeln!(writer)?; diff --git a/czkawka_core/src/similar_images.rs b/czkawka_core/src/similar_images.rs index 1b100ee..aed5560 100644 --- a/czkawka_core/src/similar_images.rs +++ b/czkawka_core/src/similar_images.rs @@ -24,7 +24,7 @@ use crate::common::{ send_info_and_wait_for_ending_all_threads, HEIC_EXTENSIONS, IMAGE_RS_SIMILAR_IMAGES_EXTENSIONS, RAW_IMAGE_EXTENSIONS, }; use crate::common_cache::{get_similar_images_cache_file, load_cache_from_file_generalized_by_path, save_cache_to_file_generalized}; -use crate::common_dir_traversal::{common_read_dir, get_lowercase_name, get_modified_time, CheckingMethod, ProgressData, ToolType}; +use crate::common_dir_traversal::{common_read_dir, get_modified_time, CheckingMethod, ProgressData, ToolType}; use crate::common_tool::{CommonData, CommonToolData, 
DeleteMethod}; use crate::common_traits::{DebugPrint, PrintResults, ResultEntry}; use crate::flc; @@ -146,7 +146,7 @@ impl SimilarImages { #[fun_time(message = "check_for_similar_images", level = "debug")] fn check_for_similar_images(&mut self, stop_receiver: Option<&Receiver<()>>, progress_sender: Option<&Sender<ProgressData>>) -> bool { - let mut folders_to_check: Vec<PathBuf> = Vec::with_capacity(1024 * 2); + let mut folders_to_check: Vec<PathBuf> = self.common_data.directories.included_directories.clone(); if !self.common_data.allowed_extensions.using_custom_extensions() { self.common_data.allowed_extensions.extend_allowed_extensions(IMAGE_RS_SIMILAR_IMAGES_EXTENSIONS); @@ -156,7 +156,7 @@ impl SimilarImages { } else { self.common_data .allowed_extensions - .validate_allowed_extensions(&[IMAGE_RS_SIMILAR_IMAGES_EXTENSIONS, RAW_IMAGE_EXTENSIONS, HEIC_EXTENSIONS].concat()); + .extend_allowed_extensions(&[IMAGE_RS_SIMILAR_IMAGES_EXTENSIONS, RAW_IMAGE_EXTENSIONS, HEIC_EXTENSIONS].concat()); if !self.common_data.allowed_extensions.using_custom_extensions() { return true; } @@ -177,13 +177,13 @@ impl SimilarImages { let segments: Vec<_> = folders_to_check - .par_iter() + .into_par_iter() .map(|current_folder| { let mut dir_result = vec![]; let mut warnings = vec![]; let mut fe_result = vec![]; - let Some(read_dir) = common_read_dir(current_folder, &mut warnings) else { + let Some(read_dir) = common_read_dir(&current_folder, &mut warnings) else { return (dir_result, warnings, fe_result); }; @@ -194,12 +194,11 @@ impl SimilarImages { let Ok(file_type) = entry_data.file_type() else { continue; }; - if file_type.is_dir() { check_folder_children( &mut dir_result, &mut warnings, - current_folder, + &current_folder, &entry_data, self.common_data.recursive_search, &self.common_data.directories, @@ -207,15 +206,15 @@ impl SimilarImages { ); } else if file_type.is_file() { atomic_counter.fetch_add(1, Ordering::Relaxed); - self.add_file_entry(current_folder, &entry_data, &mut fe_result, &mut warnings); + 
self.add_file_entry(&current_folder, &entry_data, &mut fe_result, &mut warnings); } } (dir_result, warnings, fe_result) }) .collect(); - // Advance the frontier - folders_to_check.clear(); + let required_size = segments.iter().map(|(segment, _, _)| segment.len()).sum::<usize>(); + folders_to_check = Vec::with_capacity(required_size); // Process collected data for (segment, warnings, fe_result) in segments { @@ -226,6 +225,8 @@ impl SimilarImages { } } } + eprintln!("Tested {} files", atomic_counter.load(Ordering::Relaxed)); + eprintln!("Images to check {}", self.images_to_check.len()); send_info_and_wait_for_ending_all_threads(&progress_thread_run, progress_thread_handle); @@ -233,11 +234,7 @@ impl SimilarImages { } fn add_file_entry(&self, current_folder: &Path, entry_data: &DirEntry, fe_result: &mut Vec<(String, FileEntry)>, warnings: &mut Vec<String>) { - let Some(file_name_lowercase) = get_lowercase_name(entry_data, warnings) else { - return; - }; - - if !self.common_data.allowed_extensions.matches_filename(&file_name_lowercase) { + if !self.common_data.allowed_extensions.check_if_entry_ends_with_extension(entry_data) { return; } @@ -284,7 +281,7 @@ impl SimilarImages { debug!("hash_images-load_cache - starting calculating diff"); for (name, file_entry) in mem::take(&mut self.images_to_check) { if let Some(cached_file_entry) = loaded_hash_map.get(&name) { - records_already_cached.insert(name.clone(), cached_file_entry.clone()); + records_already_cached.insert(name, cached_file_entry.clone()); } else { non_cached_files_to_check.insert(name, file_entry); } @@ -601,6 +598,7 @@ impl SimilarImages { }) .collect::<Vec<_>>(); + // Sort by tolerance found_items.sort_unstable_by_key(|f| f.0); Some((hash_to_check, found_items)) }) @@ -858,8 +856,8 @@ impl PrintResults for SimilarImages { for file_entry in struct_similar { writeln!( writer, - "{} - {} - {} - {}", - file_entry.path.display(), + "{:?} - {} - {} - {}", + file_entry.path, file_entry.dimensions, format_size(file_entry.size, BINARY), 
get_string_from_similarity(&file_entry.similarity, self.hash_size) @@ -875,8 +873,8 @@ impl PrintResults for SimilarImages { writeln!(writer)?; writeln!( writer, - "{} - {} - {} - {}", - file_entry.path.display(), + "{:?} - {} - {} - {}", + file_entry.path, file_entry.dimensions, format_size(file_entry.size, BINARY), get_string_from_similarity(&file_entry.similarity, self.hash_size) @@ -884,8 +882,8 @@ impl PrintResults for SimilarImages { for file_entry in vec_file_entry { writeln!( writer, - "{} - {} - {} - {}", - file_entry.path.display(), + "{:?} - {} - {} - {}", + file_entry.path, file_entry.dimensions, format_size(file_entry.size, BINARY), get_string_from_similarity(&file_entry.similarity, self.hash_size) diff --git a/czkawka_core/src/similar_videos.rs b/czkawka_core/src/similar_videos.rs index d6fc80e..e2525b7 100644 --- a/czkawka_core/src/similar_videos.rs +++ b/czkawka_core/src/similar_videos.rs @@ -18,7 +18,7 @@ use crate::common::{ check_folder_children, check_if_stop_received, delete_files_custom, prepare_thread_handler_common, send_info_and_wait_for_ending_all_threads, VIDEO_FILES_EXTENSIONS, }; use crate::common_cache::{get_similar_videos_cache_file, load_cache_from_file_generalized_by_path, save_cache_to_file_generalized}; -use crate::common_dir_traversal::{common_read_dir, get_lowercase_name, get_modified_time, CheckingMethod, ProgressData, ToolType}; +use crate::common_dir_traversal::{common_read_dir, get_modified_time, CheckingMethod, ProgressData, ToolType}; use crate::common_tool::{CommonData, CommonToolData, DeleteMethod}; use crate::common_traits::{DebugPrint, PrintResults, ResultEntry}; use crate::flc; @@ -130,22 +130,17 @@ impl SimilarVideos { #[fun_time(message = "check_for_similar_videos", level = "debug")] fn check_for_similar_videos(&mut self, stop_receiver: Option<&Receiver<()>>, progress_sender: Option<&Sender<ProgressData>>) -> bool { - let mut folders_to_check: Vec<PathBuf> = Vec::with_capacity(1024 * 2); + let mut folders_to_check: Vec<PathBuf> = 
self.common_data.directories.included_directories.clone(); if !self.common_data.allowed_extensions.using_custom_extensions() { self.common_data.allowed_extensions.extend_allowed_extensions(VIDEO_FILES_EXTENSIONS); } else { - self.common_data.allowed_extensions.validate_allowed_extensions(VIDEO_FILES_EXTENSIONS); + self.common_data.allowed_extensions.extend_allowed_extensions(VIDEO_FILES_EXTENSIONS); if !self.common_data.allowed_extensions.using_custom_extensions() { return true; } } - // Add root folders for finding - for id in &self.common_data.directories.included_directories { - folders_to_check.push(id.clone()); - } - let (progress_thread_handle, progress_thread_run, atomic_counter, _check_was_stopped) = prepare_thread_handler_common(progress_sender, 0, 1, 0, CheckingMethod::None, self.common_data.tool_type); @@ -156,13 +151,13 @@ impl SimilarVideos { let segments: Vec<_> = folders_to_check - .par_iter() + .into_par_iter() .map(|current_folder| { let mut dir_result = vec![]; let mut warnings = vec![]; let mut fe_result = vec![]; - let Some(read_dir) = common_read_dir(current_folder, &mut warnings) else { + let Some(read_dir) = common_read_dir(&current_folder, &mut warnings) else { return (dir_result, warnings, fe_result); }; @@ -179,7 +174,7 @@ impl SimilarVideos { check_folder_children( &mut dir_result, &mut warnings, - current_folder, + &current_folder, &entry_data, self.common_data.recursive_search, &self.common_data.directories, @@ -187,15 +182,15 @@ impl SimilarVideos { ); } else if file_type.is_file() { atomic_counter.fetch_add(1, Ordering::Relaxed); - self.add_video_file_entry(&entry_data, &mut fe_result, &mut warnings, current_folder); + self.add_video_file_entry(&entry_data, &mut fe_result, &mut warnings, &current_folder); } } (dir_result, warnings, fe_result) }) .collect(); - // Advance the frontier - folders_to_check.clear(); + let required_size = segments.iter().map(|(segment, _, _)| segment.len()).sum::<usize>(); + folders_to_check = Vec::with_capacity(required_size); // 
Process collected data for (segment, warnings, fe_result) in segments { @@ -213,11 +208,7 @@ impl SimilarVideos { } fn add_video_file_entry(&self, entry_data: &DirEntry, fe_result: &mut Vec<(String, FileEntry)>, warnings: &mut Vec<String>, current_folder: &Path) { - let Some(file_name_lowercase) = get_lowercase_name(entry_data, warnings) else { - return; - }; - - if !self.common_data.allowed_extensions.matches_filename(&file_name_lowercase) { + if !self.common_data.allowed_extensions.check_if_entry_ends_with_extension(entry_data) { return; } @@ -234,9 +225,9 @@ impl SimilarVideos { // Checking files if (self.common_data.minimal_file_size..=self.common_data.maximal_file_size).contains(&metadata.len()) { let fe: FileEntry = FileEntry { - path: current_file_name.clone(), size: metadata.len(), modified_date: get_modified_time(&metadata, warnings, &current_file_name, false), + path: current_file_name, vhash: Default::default(), error: String::new(), }; @@ -259,7 +250,7 @@ impl SimilarVideos { for (name, file_entry) in mem::take(&mut self.videos_to_check) { if let Some(cached_file_entry) = loaded_hash_map.get(&name) { - records_already_cached.insert(name.clone(), cached_file_entry.clone()); + records_already_cached.insert(name, cached_file_entry.clone()); } else { non_cached_files_to_check.insert(name, file_entry); } @@ -449,7 +440,7 @@ impl PrintResults for SimilarVideos { for struct_similar in &self.similar_vectors { writeln!(writer, "Found {} videos which have similar friends", struct_similar.len())?; for file_entry in struct_similar { - writeln!(writer, "{} - {}", file_entry.path.display(), format_size(file_entry.size, BINARY))?; + writeln!(writer, "{:?} - {}", file_entry.path, format_size(file_entry.size, BINARY))?; } writeln!(writer)?; } @@ -459,9 +450,9 @@ impl PrintResults for SimilarVideos { for (fe, struct_similar) in &self.similar_referenced_vectors { writeln!(writer, "Found {} videos which have similar friends", struct_similar.len())?; writeln!(writer)?; - writeln!(writer, 
"{} - {}", fe.path.display(), format_size(fe.size, BINARY))?; + writeln!(writer, "{:?} - {}", fe.path, format_size(fe.size, BINARY))?; for file_entry in struct_similar { - writeln!(writer, "{} - {}", file_entry.path.display(), format_size(file_entry.size, BINARY))?; + writeln!(writer, "{:?} - {}", file_entry.path, format_size(file_entry.size, BINARY))?; } writeln!(writer)?; } diff --git a/czkawka_core/src/temporary.rs b/czkawka_core/src/temporary.rs index 7ff332a..22df1df 100644 --- a/czkawka_core/src/temporary.rs +++ b/czkawka_core/src/temporary.rs @@ -71,12 +71,7 @@ impl Temporary { #[fun_time(message = "check_files", level = "debug")] fn check_files(&mut self, stop_receiver: Option<&Receiver<()>>, progress_sender: Option<&Sender<ProgressData>>) -> bool { - let mut folders_to_check: Vec<PathBuf> = Vec::with_capacity(1024 * 2); - - // Add root folders for finding - for id in &self.common_data.directories.included_directories { - folders_to_check.push(id.clone()); - } + let mut folders_to_check: Vec<PathBuf> = self.common_data.directories.included_directories.clone(); let (progress_thread_handle, progress_thread_run, atomic_counter, _check_was_stopped) = prepare_thread_handler_common(progress_sender, 0, 0, 0, CheckingMethod::None, self.common_data.tool_type); @@ -88,13 +83,13 @@ impl Temporary { let segments: Vec<_> = folders_to_check - .par_iter() + .into_par_iter() .map(|current_folder| { let mut dir_result = vec![]; let mut warnings = vec![]; let mut fe_result = vec![]; - let Some(read_dir) = common_read_dir(current_folder, &mut warnings) else { + let Some(read_dir) = common_read_dir(&current_folder, &mut warnings) else { return (dir_result, warnings, fe_result); }; @@ -111,14 +106,14 @@ impl Temporary { check_folder_children( &mut dir_result, &mut warnings, - current_folder, + &current_folder, &entry_data, self.common_data.recursive_search, &self.common_data.directories, &self.common_data.excluded_items, ); } else if file_type.is_file() { - if let Some(file_entry) = self.get_file_entry(&atomic_counter, 
&entry_data, &mut warnings, current_folder) { + if let Some(file_entry) = self.get_file_entry(&atomic_counter, &entry_data, &mut warnings, &current_folder) { fe_result.push(file_entry); } } @@ -127,8 +122,8 @@ impl Temporary { }) .collect(); - // Advance the frontier - folders_to_check.clear(); + let required_size = segments.iter().map(|(segment, _, _)| segment.len()).sum::<usize>(); + folders_to_check = Vec::with_capacity(required_size); // Process collected data for (segment, warnings, fe_result) in segments { @@ -164,8 +159,8 @@ impl Temporary { // Creating new file entry Some(FileEntry { - path: current_file_name.clone(), modified_date: get_modified_time(&metadata, warnings, &current_file_name, false), + path: current_file_name, }) } @@ -176,7 +171,7 @@ impl Temporary { let mut warnings = Vec::new(); for file_entry in &self.temporary_files { if fs::remove_file(file_entry.path.clone()).is_err() { - warnings.push(file_entry.path.display().to_string()); + warnings.push(file_entry.path.to_string_lossy().to_string()); } } self.common_data.text_messages.warnings.extend(warnings); @@ -201,7 +196,7 @@ impl PrintResults for Temporary { writeln!(writer, "Found {} temporary files.\n", self.information.number_of_temporary_files)?; for file_entry in &self.temporary_files { - writeln!(writer, "{}", file_entry.path.display())?; + writeln!(writer, "{:?}", file_entry.path)?; } Ok(()) diff --git a/czkawka_gui/src/compute_results.rs b/czkawka_gui/src/compute_results.rs index 3bb4c12..70b4b43 100644 --- a/czkawka_gui/src/compute_results.rs +++ b/czkawka_gui/src/compute_results.rs @@ -1,6 +1,5 @@ use std::cell::RefCell; use std::collections::HashMap; -use std::path::PathBuf; use std::rc::Rc; use chrono::NaiveDateTime; @@ -13,7 +12,7 @@ use humansize::{format_size, BINARY}; use czkawka_core::bad_extensions::BadExtensions; use czkawka_core::big_file::BigFile; use czkawka_core::broken_files::BrokenFiles; -use czkawka_core::common::split_path; +use czkawka_core::common::{split_path, split_path_compare}; 
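The frontier change in `temporary.rs` above replaces `folders_to_check.clear()` with a fresh `Vec` pre-sized to the exact summed length of the per-folder result segments, so the next BFS level never reallocates while being filled. A minimal sketch of that pattern, using plain `String` folders as a stand-in (czkawka's real segments also carry warnings and file entries):

```rust
// Sketch of the diff's frontier pattern: allocate the next BFS level up front
// with the exact summed length of the per-folder result segments, instead of
// clearing and regrowing the same Vec level after level.
fn collect_next_frontier(segments: Vec<Vec<String>>) -> Vec<String> {
    // Same shape as the diff: sum segment lengths, then reserve exactly that much.
    let required_size = segments.iter().map(|segment| segment.len()).sum::<usize>();
    let mut folders_to_check: Vec<String> = Vec::with_capacity(required_size);
    for segment in segments {
        folders_to_check.extend(segment);
    }
    folders_to_check
}

fn main() {
    // Simulated results of two parallel directory-reading tasks.
    let segments = vec![
        vec!["a/1".to_string(), "a/2".to_string()],
        vec!["b/1".to_string()],
    ];
    let next = collect_next_frontier(segments);
    assert_eq!(next.len(), 3);
    println!("{next:?}");
}
```

Because `with_capacity` receives the exact total, the `extend` calls never trigger a reallocation within a level, which matters when a level contains tens of thousands of folders.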
use czkawka_core::common_dir_traversal::{CheckingMethod, FileEntry}; use czkawka_core::common_tool::CommonData; use czkawka_core::duplicate::DuplicateFinder; @@ -265,10 +264,7 @@ fn computer_bad_extensions( // Sort let mut vector = vector.clone(); - vector.sort_unstable_by_key(|e| { - let t = split_path(e.path.as_path()); - (t.0, t.1) - }); + vector.sort_unstable_by(|a, b| split_path_compare(a.path.as_path(), b.path.as_path())); for file_entry in vector { let (directory, file) = split_path(&file_entry.path); @@ -340,10 +336,7 @@ fn computer_broken_files( // Sort let mut vector = vector.clone(); - vector.sort_unstable_by_key(|e| { - let t = split_path(e.path.as_path()); - (t.0, t.1) - }); + vector.sort_unstable_by(|a, b| split_path_compare(a.path.as_path(), b.path.as_path())); for file_entry in vector { let (directory, file) = split_path(&file_entry.path); @@ -506,10 +499,7 @@ fn computer_same_music( // Sort let vec_file_entry = if vec_file_entry.len() >= 2 { let mut vec_file_entry = vec_file_entry.clone(); - vec_file_entry.sort_unstable_by_key(|e| { - let t = split_path(e.path.as_path()); - (t.0, t.1) - }); + vec_file_entry.sort_unstable_by(|a, b| split_path_compare(a.path.as_path(), b.path.as_path())); vec_file_entry } else { vec_file_entry.clone() @@ -561,10 +551,7 @@ fn computer_same_music( // Sort let vec_file_entry = if vec_file_entry.len() >= 2 { let mut vec_file_entry = vec_file_entry.clone(); - vec_file_entry.sort_unstable_by_key(|e| { - let t = split_path(e.path.as_path()); - (t.0, t.1) - }); + vec_file_entry.sort_unstable_by(|a, b| split_path_compare(a.path.as_path(), b.path.as_path())); vec_file_entry } else { vec_file_entry.clone() @@ -670,10 +657,7 @@ fn computer_similar_videos( // Sort let vec_file_entry = if vec_file_entry.len() >= 2 { let mut vec_file_entry = vec_file_entry.clone(); - vec_file_entry.sort_unstable_by_key(|e| { - let t = split_path(e.path.as_path()); - (t.0, t.1) - }); + vec_file_entry.sort_unstable_by(|a, b| 
split_path_compare(a.path.as_path(), b.path.as_path())); vec_file_entry } else { vec_file_entry.clone() } @@ -692,10 +676,7 @@ fn computer_similar_videos( // Sort let vec_file_entry = if vec_file_entry.len() >= 2 { let mut vec_file_entry = vec_file_entry.clone(); - vec_file_entry.sort_unstable_by_key(|e| { - let t = split_path(e.path.as_path()); - (t.0, t.1) - }); + vec_file_entry.sort_unstable_by(|a, b| split_path_compare(a.path.as_path(), b.path.as_path())); vec_file_entry } else { vec_file_entry.clone() } @@ -895,10 +876,7 @@ fn computer_temporary_files( // Sort // TODO maybe simplify this via common file entry let mut vector = vector.clone(); - vector.sort_unstable_by_key(|e| { - let t = split_path(e.path.as_path()); - (t.0, t.1) - }); + vector.sort_unstable_by(|a, b| split_path_compare(a.path.as_path(), b.path.as_path())); for file_entry in vector { let (directory, file) = split_path(&file_entry.path); @@ -1100,24 +1078,21 @@ fn computer_empty_folders( let list_store = get_list_store(tree_view); let hashmap = ef.get_empty_folder_list(); - let mut vector = hashmap.keys().cloned().collect::<Vec<PathBuf>>(); + let mut vector = hashmap.values().collect::<Vec<_>>(); - vector.sort_unstable_by_key(|e| { - let t = split_path(e.as_path()); - (t.0, t.1) - }); + vector.sort_unstable_by(|a, b| split_path_compare(a.path.as_path(), b.path.as_path())); - for path in vector { - let (directory, file) = split_path(&path); + for fe in vector { + let (directory, file) = split_path(&fe.path); let values: [(u32, &dyn ToValue); COLUMNS_NUMBER] = [ (ColumnsEmptyFolders::SelectionButton as u32, &false), (ColumnsEmptyFolders::Name as u32, &file), (ColumnsEmptyFolders::Path as u32, &directory), ( ColumnsEmptyFolders::Modification as u32, - &(NaiveDateTime::from_timestamp_opt(hashmap.get(&path).unwrap().modified_date as i64, 0).unwrap().to_string()), + &(NaiveDateTime::from_timestamp_opt(fe.modified_date as i64, 0).unwrap().to_string()), ), - (ColumnsEmptyFolders::ModificationAsSecs as u32, 
&(hashmap.get(&path).unwrap().modified_date)), + (ColumnsEmptyFolders::ModificationAsSecs as u32, &(fe.modified_date)), ]; list_store.set(&list_store.append(), &values); } @@ -1353,10 +1328,7 @@ fn computer_duplicate_finder( fn vector_sort_unstable_entry_by_path(vector: &Vec) -> Vec { if vector.len() >= 2 { let mut vector = vector.clone(); - vector.sort_unstable_by_key(|e| { - let t = split_path(e.path.as_path()); - (t.0, t.1) - }); + vector.sort_unstable_by(|a, b| split_path_compare(a.path.as_path(), b.path.as_path())); vector } else { vector.clone() @@ -1365,10 +1337,7 @@ fn vector_sort_unstable_entry_by_path(vector: &Vec) -> Vec fn vector_sort_simple_unstable_entry_by_path(vector: &[FileEntry]) -> Vec { let mut vector = vector.to_owned(); - vector.sort_unstable_by_key(|e| { - let t = split_path(e.path.as_path()); - (t.0, t.1) - }); + vector.sort_unstable_by(|a, b| split_path_compare(a.path.as_path(), b.path.as_path())); vector } diff --git a/czkawka_gui/src/saving_loading.rs b/czkawka_gui/src/saving_loading.rs index 281c8c2..cd81d87 100644 --- a/czkawka_gui/src/saving_loading.rs +++ b/czkawka_gui/src/saving_loading.rs @@ -226,7 +226,7 @@ impl LoadSaveStruct { text_view_errors, &flg!( "saving_loading_folder_config_instead_file", - generate_translation_hashmap(vec![("path", config_dir.display().to_string())]) + generate_translation_hashmap(vec![("path", config_dir.to_string_lossy().to_string())]) ), ); return None; @@ -236,7 +236,7 @@ impl LoadSaveStruct { text_view_errors, &flg!( "saving_loading_failed_to_create_configuration_folder", - generate_translation_hashmap(vec![("path", config_dir.display().to_string()), ("reason", e.to_string())]) + generate_translation_hashmap(vec![("path", config_dir.to_string_lossy().to_string()), ("reason", e.to_string())]) ), ); return None; @@ -249,7 +249,7 @@ impl LoadSaveStruct { text_view_errors, &flg!( "saving_loading_failed_to_create_config_file", - generate_translation_hashmap(vec![("path", 
config_file.display().to_string()), ("reason", e.to_string())]) + generate_translation_hashmap(vec![("path", config_file.to_string_lossy().to_string()), ("reason", e.to_string())]) ), ); return None; @@ -264,7 +264,7 @@ impl LoadSaveStruct { text_view_errors, &flg!( "saving_loading_failed_to_read_config_file", - generate_translation_hashmap(vec![("path", config_file.display().to_string())]) + generate_translation_hashmap(vec![("path", config_file.to_string_lossy().to_string())]) ), ); } @@ -278,7 +278,7 @@ impl LoadSaveStruct { text_view_errors, &flg!( "saving_loading_failed_to_create_config_file", - generate_translation_hashmap(vec![("path", config_file.display().to_string()), ("reason", e.to_string())]) + generate_translation_hashmap(vec![("path", config_file.to_string_lossy().to_string()), ("reason", e.to_string())]) ), ); return None; @@ -299,7 +299,7 @@ impl LoadSaveStruct { text_view_errors, &flg!( "saving_loading_failed_to_read_data_from_file", - generate_translation_hashmap(vec![("path", config_file.display().to_string()), ("reason", e.to_string())]) + generate_translation_hashmap(vec![("path", config_file.to_string_lossy().to_string()), ("reason", e.to_string())]) ), ); return; @@ -370,7 +370,7 @@ impl LoadSaveStruct { text_view_errors, flg!( "saving_loading_saving_success", - generate_translation_hashmap(vec![("name", config_file.display().to_string())]) + generate_translation_hashmap(vec![("name", config_file.to_string_lossy().to_string())]) ) .as_str(), ); @@ -379,7 +379,7 @@ impl LoadSaveStruct { text_view_errors, flg!( "saving_loading_saving_failure", - generate_translation_hashmap(vec![("name", config_file.display().to_string())]) + generate_translation_hashmap(vec![("name", config_file.to_string_lossy().to_string())]) ) .as_str(), ); diff --git a/krokiet/src/connect_scan.rs b/krokiet/src/connect_scan.rs index 7e3c834..4f9a3c4 100644 --- a/krokiet/src/connect_scan.rs +++ b/krokiet/src/connect_scan.rs @@ -2,8 +2,8 @@ use 
crate::settings::{collect_settings, SettingsCustom, ALLOWED_HASH_TYPE_VALUES use crate::{CurrentTab, GuiState, MainListModel, MainWindow, ProgressToSend}; use chrono::NaiveDateTime; use crossbeam_channel::{Receiver, Sender}; -use czkawka_core::common::{split_path, DEFAULT_THREAD_SIZE}; -use czkawka_core::common_dir_traversal::ProgressData; +use czkawka_core::common::{split_path, split_path_compare, DEFAULT_THREAD_SIZE}; +use czkawka_core::common_dir_traversal::{FileEntry, FolderEntry, ProgressData}; use czkawka_core::common_tool::CommonData; use czkawka_core::common_traits::ResultEntry; use czkawka_core::empty_files::EmptyFiles; @@ -12,7 +12,6 @@ use czkawka_core::similar_images; use czkawka_core::similar_images::SimilarImages; use humansize::{format_size, BINARY}; use slint::{ComponentHandle, ModelRc, SharedString, VecModel, Weak}; -use std::path::PathBuf; use std::rc::Rc; use std::thread; @@ -81,31 +80,34 @@ fn scan_similar_images(a: Weak, progress_sender: Sender().set_info_text(messages.into()); + write_similar_images_results(&app, vector, messages, hash_size); }) }) .unwrap(); } +fn write_similar_images_results(app: &MainWindow, vector: Vec>, messages: String, hash_size: u8) { + let items_found = vector.len(); + let items = Rc::new(VecModel::default()); + for vec_fe in vector { + insert_data_to_model(&items, ModelRc::new(VecModel::default()), true); + for fe in vec_fe { + let (directory, file) = split_path(fe.get_path()); + let data_model = VecModel::from_slice(&[ + similar_images::get_string_from_similarity(&fe.similarity, hash_size).into(), + format_size(fe.size, BINARY).into(), + fe.dimensions.clone().into(), + file.into(), + directory.into(), + NaiveDateTime::from_timestamp_opt(fe.get_modified_date() as i64, 0).unwrap().to_string().into(), + ]); + + insert_data_to_model(&items, data_model, false); + } + } + app.set_similar_images_model(items.into()); + app.invoke_scan_ended(format!("Found {items_found} similar images files").into()); + 
app.global::<GuiState>().set_info_text(messages.into()); +} fn scan_empty_files(a: Weak<MainWindow>, progress_sender: Sender<ProgressData>, stop_receiver: Receiver<()>, custom_settings: SettingsCustom) { thread::Builder::new() @@ -118,31 +120,31 @@ fn scan_empty_files(a: Weak<MainWindow>, progress_sender: Sender<ProgressData>, let mut vector = finder.get_empty_files().clone(); let messages = finder.get_text_messages().create_messages_text(); - vector.sort_unstable_by_key(|e| { - let t = split_path(e.get_path()); - (t.0, t.1) - }); + vector.sort_unstable_by(|a, b| split_path_compare(a.path.as_path(), b.path.as_path())); a.upgrade_in_event_loop(move |app| { - let number_of_empty_files = vector.len(); - let items = Rc::new(VecModel::default()); - for fe in vector { - let (directory, file) = split_path(fe.get_path()); - let data_model = VecModel::from_slice(&[ - file.into(), - directory.into(), - NaiveDateTime::from_timestamp_opt(fe.get_modified_date() as i64, 0).unwrap().to_string().into(), - ]); - - insert_data_to_model(&items, data_model, false); - } - app.set_empty_files_model(items.into()); - app.invoke_scan_ended(format!("Found {} empty files", number_of_empty_files).into()); - app.global::<GuiState>().set_info_text(messages.into()); + write_empty_files_results(&app, vector, messages); }) }) .unwrap(); } +fn write_empty_files_results(app: &MainWindow, vector: Vec<FileEntry>, messages: String) { + let items_found = vector.len(); + let items = Rc::new(VecModel::default()); + for fe in vector { + let (directory, file) = split_path(fe.get_path()); + let data_model = VecModel::from_slice(&[ + file.into(), + directory.into(), + NaiveDateTime::from_timestamp_opt(fe.get_modified_date() as i64, 0).unwrap().to_string().into(), + ]); + + insert_data_to_model(&items, data_model, false); + } + app.set_empty_files_model(items.into()); + app.invoke_scan_ended(format!("Found {items_found} empty files").into()); + app.global::<GuiState>().set_info_text(messages.into()); +} fn scan_empty_folders(a: Weak<MainWindow>, progress_sender: Sender<ProgressData>, stop_receiver: Receiver<()>, 
custom_settings: SettingsCustom) { thread::Builder::new() @@ -152,34 +154,34 @@ fn scan_empty_folders(a: Weak<MainWindow>, progress_sender: Sender<ProgressData> set_common_settings(&mut finder, &custom_settings); finder.find_empty_folders(Some(&stop_receiver), Some(&progress_sender)); - let mut vector = finder.get_empty_folder_list().keys().cloned().collect::<Vec<PathBuf>>(); + let mut vector = finder.get_empty_folder_list().values().cloned().collect::<Vec<FolderEntry>>(); let messages = finder.get_text_messages().create_messages_text(); - vector.sort_unstable_by_key(|e| { - let t = split_path(e.as_path()); - (t.0, t.1) - }); + vector.sort_unstable_by(|a, b| split_path_compare(a.path.as_path(), b.path.as_path())); a.upgrade_in_event_loop(move |app| { - let folder_map = finder.get_empty_folder_list(); - let items = Rc::new(VecModel::default()); - for path in vector { - let (directory, file) = split_path(&path); - let data_model = VecModel::from_slice(&[ - file.into(), - directory.into(), - NaiveDateTime::from_timestamp_opt(folder_map[&path].modified_date as i64, 0).unwrap().to_string().into(), - ]); - - insert_data_to_model(&items, data_model, false); - } - app.set_empty_folder_model(items.into()); - app.invoke_scan_ended(format!("Found {} empty folders", folder_map.len()).into()); - app.global::<GuiState>().set_info_text(messages.into()); + write_empty_folders_results(&app, vector, messages); }) }) .unwrap(); } +fn write_empty_folders_results(app: &MainWindow, vector: Vec<FolderEntry>, messages: String) { + let items_found = vector.len(); + let items = Rc::new(VecModel::default()); + for fe in vector { + let (directory, file) = split_path(&fe.path); + let data_model = VecModel::from_slice(&[ + file.into(), + directory.into(), + NaiveDateTime::from_timestamp_opt(fe.modified_date as i64, 0).unwrap().to_string().into(), + ]); + + insert_data_to_model(&items, data_model, false); + } + app.set_empty_folder_model(items.into()); + app.invoke_scan_ended(format!("Found {items_found} empty folders").into()); + 
app.global::<GuiState>().set_info_text(messages.into()); +} fn insert_data_to_model(items: &Rc<VecModel<MainListModel>>, data_model: ModelRc<SharedString>, header_row: bool) { let main = MainListModel {
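The change repeated throughout this diff swaps `sort_unstable_by_key` (which built an owned `(directory, file)` string tuple for every element) for `sort_unstable_by` with `split_path_compare`, a comparator that needs no per-element allocation. A minimal reconstruction of such a comparator, assuming it orders by parent directory first and file name second (czkawka's actual `split_path_compare` in `czkawka_core::common` may differ in detail):

```rust
use std::cmp::Ordering;
use std::path::Path;

// Hypothetical sketch: compare parent directories first, then file names,
// matching the ordering of the old (directory, file) sort key but without
// allocating an owned key for each element being sorted.
fn split_path_compare(a: &Path, b: &Path) -> Ordering {
    let parent_a = a.parent().unwrap_or_else(|| Path::new(""));
    let parent_b = b.parent().unwrap_or_else(|| Path::new(""));
    parent_a
        .cmp(parent_b)
        .then_with(|| a.file_name().cmp(&b.file_name()))
}

fn main() {
    let mut paths = vec!["/b/a.txt", "/a/z.txt", "/a/b.txt"];
    paths.sort_unstable_by(|x, y| split_path_compare(Path::new(x), Path::new(y)));
    assert_eq!(paths, ["/a/b.txt", "/a/z.txt", "/b/a.txt"]);
    println!("{paths:?}");
}
```

With `sort_unstable_by_key` the key closure allocates two `String`s per element per comparison-key computation; the comparator borrows the existing `Path` data instead, which is why the same one-line change appears in every `computer_*` and `scan_*` function above.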