
Add cache support to similar music files (#558)

* Simplify cache code

* Better saving/loading.
Add support for loading/saving json files in release mode

* Broken files cache

* Finally same music cache
Rafał Mikrut 2022-01-05 22:47:27 +01:00 committed by GitHub
parent db3b1f5304
commit aaa5885326
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
17 changed files with 527 additions and 426 deletions

View file

@ -1,8 +1,8 @@
## Version 4.0.0 - ?
- Multithreading support for collecting files to check(2/3x speedup on 4 thread processor and SSD) - [#502](https://github.com/qarmin/czkawka/pull/502), [#504](https://github.com/qarmin/czkawka/pull/504)
- Add Polish, German and Italian translation - [#469](https://github.com/qarmin/czkawka/pull/469), [#508](https://github.com/qarmin/czkawka/pull/508), [5be](https://github.com/qarmin/czkawka/commit/5be801e76395855f07ab1da43cdbb8bd0b843834)
- Add multiple translations - Polish, Italian, French, German, Russian ... - [#469](https://github.com/qarmin/czkawka/pull/469), [#508](https://github.com/qarmin/czkawka/pull/508), [5be](https://github.com/qarmin/czkawka/commit/5be801e76395855f07ab1da43cdbb8bd0b843834)
- Add support for finding similar videos - [#460](https://github.com/qarmin/czkawka/pull/460)
- GUI code refactoring(could fix some bugs) - [#462](https://github.com/qarmin/czkawka/pull/462)
- GUI code refactoring and search code unification - [#462](https://github.com/qarmin/czkawka/pull/462), [#531](https://github.com/qarmin/czkawka/pull/531)
- Fixed crash when trying to hard/symlink 0 files - [#462](https://github.com/qarmin/czkawka/pull/462)
- GTK 4 compatibility improvements for future change of toolkit - [#467](https://github.com/qarmin/czkawka/pull/467), [#468](https://github.com/qarmin/czkawka/pull/468), [#473](https://github.com/qarmin/czkawka/pull/473), [#474](https://github.com/qarmin/czkawka/pull/474), [#503](https://github.com/qarmin/czkawka/pull/503), [#505](https://github.com/qarmin/czkawka/pull/505)
- Change minimal supported OS to Ubuntu 20.04(needed by GTK) - [#468](https://github.com/qarmin/czkawka/pull/468)
@ -22,6 +22,7 @@
- Image compare performance and usability improvements - [#529](https://github.com/qarmin/czkawka/pull/529), [#528](https://github.com/qarmin/czkawka/pull/528), [#530](https://github.com/qarmin/czkawka/pull/530), [#525](https://github.com/qarmin/czkawka/pull/525)
- Reorganize(unify) saving/loading data from file - [#524](https://github.com/qarmin/czkawka/pull/524)
- Add "reference folders" - [#516](https://github.com/qarmin/czkawka/pull/516)
- Add cache for similar music files - [#558](https://github.com/qarmin/czkawka/pull/558)
## Version 3.3.1 - 22.11.2021r
- Fix crash when moving buttons [#457](https://github.com/qarmin/czkawka/pull/457)

View file

@ -1,5 +1,5 @@
use std::collections::BTreeMap;
use std::fs::{File, Metadata, OpenOptions};
use std::fs::{File, Metadata};
use std::io::prelude::*;
use std::io::{BufReader, BufWriter};
use std::path::{Path, PathBuf};
@ -10,10 +10,10 @@ use std::time::{Duration, SystemTime, UNIX_EPOCH};
use std::{fs, mem, panic, thread};
use crossbeam_channel::Receiver;
use directories_next::ProjectDirs;
use rayon::prelude::*;
use serde::{Deserialize, Serialize};
use crate::common::Common;
use crate::common::{open_cache_folder, Common};
use crate::common_directory::Directories;
use crate::common_extensions::Extensions;
use crate::common_items::ExcludedItems;
@ -23,8 +23,6 @@ use crate::fl;
use crate::localizer::generate_translation_hashmap;
use crate::similar_images::{AUDIO_FILES_EXTENSIONS, IMAGE_RS_BROKEN_FILES_EXTENSIONS, ZIP_FILES_EXTENSIONS};
const CACHE_FILE_NAME: &str = "cache_broken_files.txt";
#[derive(Debug)]
pub struct ProgressData {
pub current_stage: u8,
@ -39,7 +37,7 @@ pub enum DeleteMethod {
Delete,
}
#[derive(Clone)]
#[derive(Clone, Serialize, Deserialize)]
pub struct FileEntry {
pub path: PathBuf,
pub modified_date: u64,
@ -48,7 +46,7 @@ pub struct FileEntry {
pub error_string: String,
}
#[derive(Copy, Clone, PartialEq, Eq)]
#[derive(Copy, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum TypeOfFile {
Unknown = -1,
Image = 0,
@ -82,6 +80,8 @@ pub struct BrokenFiles {
delete_method: DeleteMethod,
stopped_search: bool,
use_cache: bool,
delete_outdated_cache: bool, // TODO add this to GUI
save_also_as_json: bool,
}
impl BrokenFiles {
@ -98,6 +98,8 @@ impl BrokenFiles {
stopped_search: false,
broken_files: Default::default(),
use_cache: true,
delete_outdated_cache: true,
save_also_as_json: false,
}
}
@ -135,6 +137,10 @@ impl BrokenFiles {
self.delete_method = delete_method;
}
pub fn set_save_also_as_json(&mut self, save_also_as_json: bool) {
self.save_also_as_json = save_also_as_json;
}
pub fn set_use_cache(&mut self, use_cache: bool) {
self.use_cache = use_cache;
}
@ -350,7 +356,7 @@ impl BrokenFiles {
let mut non_cached_files_to_check: BTreeMap<String, FileEntry> = Default::default();
if self.use_cache {
loaded_hash_map = match load_cache_from_file(&mut self.text_messages) {
loaded_hash_map = match load_cache_from_file(&mut self.text_messages, self.delete_outdated_cache) {
Some(t) => t,
None => Default::default(),
};
@ -501,7 +507,7 @@ impl BrokenFiles {
for (_name, file_entry) in loaded_hash_map {
all_results.insert(file_entry.path.to_string_lossy().to_string(), file_entry);
}
save_cache_to_file(&all_results, &mut self.text_messages);
save_cache_to_file(&all_results, &mut self.text_messages, self.save_also_as_json);
}
self.information.number_of_broken_files = self.broken_files.len();
@ -620,137 +626,84 @@ impl PrintResults for BrokenFiles {
}
}
fn save_cache_to_file(hashmap_file_entry: &BTreeMap<String, FileEntry>, text_messages: &mut Messages) {
if let Some(proj_dirs) = ProjectDirs::from("pl", "Qarmin", "Czkawka") {
// Lin: /home/username/.cache/czkawka
// Win: C:\Users\Username\AppData\Local\Qarmin\Czkawka\cache
// Mac: /Users/Username/Library/Caches/pl.Qarmin.Czkawka
let cache_dir = PathBuf::from(proj_dirs.cache_dir());
if cache_dir.exists() {
if !cache_dir.is_dir() {
text_messages.messages.push(format!("Config dir {} is a file!", cache_dir.display()));
return;
}
} else if let Err(e) = fs::create_dir_all(&cache_dir) {
text_messages.messages.push(format!("Cannot create config dir {}, reason {}", cache_dir.display(), e));
return;
fn save_cache_to_file(old_hashmap: &BTreeMap<String, FileEntry>, text_messages: &mut Messages, save_also_as_json: bool) {
let mut hashmap: BTreeMap<String, FileEntry> = Default::default();
for (path, fe) in old_hashmap {
if fe.size > 1024 {
hashmap.insert(path.clone(), fe.clone());
}
let cache_file = cache_dir.join(CACHE_FILE_NAME);
let file_handler = match OpenOptions::new().truncate(true).write(true).create(true).open(&cache_file) {
Ok(t) => t,
Err(e) => {
}
let hashmap = &hashmap;
if let Some(((file_handler, cache_file), (file_handler_json, cache_file_json))) = open_cache_folder(&get_cache_file(), true, save_also_as_json, &mut text_messages.warnings) {
{
let writer = BufWriter::new(file_handler.unwrap()); // Unwrap because cannot fail here
if let Err(e) = bincode::serialize_into(writer, hashmap) {
text_messages
.messages
.push(format!("Cannot create or open cache file {}, reason {}", cache_file.display(), e));
.warnings
.push(format!("Cannot write data to cache file {}, reason {}", cache_file.display(), e));
return;
}
};
let mut writer = BufWriter::new(file_handler);
for file_entry in hashmap_file_entry.values() {
// Only save to cache files which have more than 1KB
if file_entry.size > 1024 {
let string: String = format!(
"{}//{}//{}//{}",
file_entry.path.display(),
file_entry.size,
file_entry.modified_date,
file_entry.error_string
);
if let Err(e) = writeln!(writer, "{}", string) {
}
if save_also_as_json {
if let Some(file_handler_json) = file_handler_json {
let writer = BufWriter::new(file_handler_json);
if let Err(e) = serde_json::to_writer(writer, hashmap) {
text_messages
.messages
.push(format!("Failed to save some data to cache file {}, reason {}", cache_file.display(), e));
.warnings
.push(format!("Cannot write data to cache file {}, reason {}", cache_file_json.display(), e));
return;
};
}
}
}
text_messages.messages.push(format!("Properly saved to file {} cache entries.", hashmap.len()));
}
}
fn load_cache_from_file(text_messages: &mut Messages) -> Option<BTreeMap<String, FileEntry>> {
if let Some(proj_dirs) = ProjectDirs::from("pl", "Qarmin", "Czkawka") {
let cache_dir = PathBuf::from(proj_dirs.cache_dir());
let cache_file = cache_dir.join(CACHE_FILE_NAME);
// TODO add before checking if cache exists(if not just return) but if exists then enable error
let file_handler = match OpenOptions::new().read(true).open(&cache_file) {
Ok(t) => t,
Err(_inspected) => {
// text_messages.messages.push(format!("Cannot find or open cache file {}", cache_file.display())); // This shouldn't be write to output
return None;
}
};
let reader = BufReader::new(file_handler);
let mut hashmap_loaded_entries: BTreeMap<String, FileEntry> = Default::default();
// Read the file line by line using the lines() iterator from std::io::BufRead.
for (index, line) in reader.lines().enumerate() {
let line = match line {
fn load_cache_from_file(text_messages: &mut Messages, delete_outdated_cache: bool) -> Option<BTreeMap<String, FileEntry>> {
if let Some(((file_handler, cache_file), (file_handler_json, cache_file_json))) = open_cache_folder(&get_cache_file(), false, true, &mut text_messages.warnings) {
let mut hashmap_loaded_entries: BTreeMap<String, FileEntry>;
if let Some(file_handler) = file_handler {
let reader = BufReader::new(file_handler);
hashmap_loaded_entries = match bincode::deserialize_from(reader) {
Ok(t) => t,
Err(e) => {
text_messages
.warnings
.push(format!("Failed to load line number {} from cache file {}, reason {}", index + 1, cache_file.display(), e));
.push(format!("Failed to load data from cache file {}, reason {}", cache_file.display(), e));
return None;
}
};
} else {
let reader = BufReader::new(file_handler_json.unwrap()); // Unwrap cannot fail, because at least one file must be valid
hashmap_loaded_entries = match serde_json::from_reader(reader) {
Ok(t) => t,
Err(e) => {
text_messages
.warnings
.push(format!("Failed to load data from cache file {}, reason {}", cache_file_json.display(), e));
return None;
}
};
let uuu = line.split("//").collect::<Vec<&str>>();
if uuu.len() != 4 {
text_messages
.warnings
.push(format!("Found invalid data in line {} - ({}) in cache file {}", index + 1, line, cache_file.display()));
continue;
}
// Don't load cache data if destination file not exists
if Path::new(uuu[0]).exists() {
hashmap_loaded_entries.insert(
uuu[0].to_string(),
FileEntry {
path: PathBuf::from(uuu[0]),
size: match uuu[1].parse::<u64>() {
Ok(t) => t,
Err(e) => {
text_messages.warnings.push(format!(
"Found invalid size value in line {} - ({}) in cache file {}, reason {}",
index + 1,
line,
cache_file.display(),
e
));
continue;
}
},
modified_date: match uuu[2].parse::<u64>() {
Ok(t) => t,
Err(e) => {
text_messages.warnings.push(format!(
"Found invalid modified date value in line {} - ({}) in cache file {}, reason {}",
index + 1,
line,
cache_file.display(),
e
));
continue;
}
},
type_of_file: check_extension_avaibility(&uuu[0].to_lowercase()),
error_string: uuu[3].to_string(),
},
);
}
}
// Don't load cache data if the destination file no longer exists
if delete_outdated_cache {
hashmap_loaded_entries.retain(|src_path, _file_entry| Path::new(src_path).exists());
}
text_messages.messages.push(format!("Properly loaded {} cache entries.", hashmap_loaded_entries.len()));
return Some(hashmap_loaded_entries);
}
text_messages.messages.push("Cannot find or open system config dir to save cache file".to_string());
None
}
fn get_cache_file() -> String {
"cache_broken_files.bin".to_string()
}
fn check_extension_avaibility(file_name_lowercase: &str) -> TypeOfFile {
if IMAGE_RS_BROKEN_FILES_EXTENSIONS.iter().any(|e| file_name_lowercase.ends_with(e)) {
TypeOfFile::Image
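
The two filtering steps in the broken-files cache code above (keeping only entries larger than 1024 bytes when saving, and dropping entries whose file has disappeared when loading) can be sketched with the standard library alone. This is a minimal illustration, not the project's code: `CachedEntry`, `prune_missing` and `filter_for_saving` are hypothetical names, and `CachedEntry` is a simplified stand-in for `FileEntry`.

```rust
use std::collections::BTreeMap;
use std::path::Path;

/// Hypothetical, simplified stand-in for the `FileEntry` stored in the cache.
#[derive(Clone, Debug, PartialEq)]
pub struct CachedEntry {
    pub size: u64,
    pub modified_date: u64,
}

/// Removes entries whose key no longer points at an existing path.
/// Mirrors the `retain` call in `load_cache_from_file`; a no-op when the flag is off.
pub fn prune_missing(cache: &mut BTreeMap<String, CachedEntry>, delete_outdated_cache: bool) {
    if delete_outdated_cache {
        cache.retain(|path, _entry| Path::new(path).exists());
    }
}

/// Returns a copy holding only entries strictly larger than `min_size` bytes,
/// as the save path does with a threshold of 1024.
pub fn filter_for_saving(cache: &BTreeMap<String, CachedEntry>, min_size: u64) -> BTreeMap<String, CachedEntry> {
    cache
        .iter()
        .filter(|(_, entry)| entry.size > min_size)
        .map(|(path, entry)| (path.clone(), entry.clone()))
        .collect()
}
```

Pruning on load keeps the cache from growing without bound, at the cost of one filesystem check per cached path; that cost is presumably why the diff puts it behind the `delete_outdated_cache` flag.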

View file

@ -1,8 +1,9 @@
use directories_next::ProjectDirs;
use image::{DynamicImage, ImageBuffer, Rgb};
use imagepipe::{ImageSource, Pipeline};
use std::ffi::OsString;
use std::fs;
use std::fs::OpenOptions;
use std::fs::{File, OpenOptions};
use std::io::BufReader;
use std::path::{Path, PathBuf};
use std::time::SystemTime;
@ -11,6 +12,62 @@ use std::time::SystemTime;
pub struct Common();
pub fn open_cache_folder(cache_file_name: &str, save_to_cache: bool, use_json: bool, warnings: &mut Vec<String>) -> Option<((Option<File>, PathBuf), (Option<File>, PathBuf))> {
if let Some(proj_dirs) = ProjectDirs::from("pl", "Qarmin", "Czkawka") {
let cache_dir = PathBuf::from(proj_dirs.cache_dir());
let cache_file = cache_dir.join(cache_file_name);
let cache_file_json = cache_dir.join(cache_file_name.replace(".bin", ".json"));
let mut file_handler_default = None;
let mut file_handler_json = None;
if save_to_cache {
if cache_dir.exists() {
if !cache_dir.is_dir() {
warnings.push(format!("Config dir {} is a file!", cache_dir.display()));
return None;
}
} else if let Err(e) = fs::create_dir_all(&cache_dir) {
warnings.push(format!("Cannot create config dir {}, reason {}", cache_dir.display(), e));
return None;
}
file_handler_default = Some(match OpenOptions::new().truncate(true).write(true).create(true).open(&cache_file) {
Ok(t) => t,
Err(e) => {
warnings.push(format!("Cannot create or open cache file {}, reason {}", cache_file.display(), e));
return None;
}
});
if use_json {
file_handler_json = Some(match OpenOptions::new().truncate(true).write(true).create(true).open(&cache_file_json) {
Ok(t) => t,
Err(e) => {
warnings.push(format!("Cannot create or open cache file {}, reason {}", cache_file_json.display(), e));
return None;
}
});
}
} else {
if let Ok(t) = OpenOptions::new().read(true).open(&cache_file) {
file_handler_default = Some(t);
} else {
if use_json {
file_handler_json = Some(match OpenOptions::new().read(true).open(&cache_file_json) {
Ok(t) => t,
Err(_) => return None,
});
} else {
// messages.push(format!("Cannot find or open cache file {}", cache_file.display())); // No error or warning
return None;
}
}
};
return Some(((file_handler_default, cache_file), (file_handler_json, cache_file_json)));
}
None
}
pub fn get_dynamic_image_from_raw_image(path: impl AsRef<Path> + std::fmt::Debug) -> Option<DynamicImage> {
let file_handler = match OpenOptions::new().read(true).open(&path) {
Ok(t) => t,
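
The read branch of `open_cache_folder` above tries the binary cache first and falls back to the JSON sibling only when `use_json` is set. A reduced, std-only sketch of that lookup order follows. The names `CacheFormat`, `json_sibling` and `open_for_reading` are hypothetical, and the cache directory is passed in explicitly instead of being resolved through `ProjectDirs`.

```rust
use std::fs::File;
use std::path::{Path, PathBuf};

/// Which on-disk format was opened (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub enum CacheFormat {
    Binary,
    Json,
}

/// Derives the JSON sibling name the way the diff does: replace ".bin" with ".json".
pub fn json_sibling(cache_file_name: &str) -> String {
    cache_file_name.replace(".bin", ".json")
}

/// Tries the binary cache first and falls back to the JSON file only when allowed.
/// Returns `None` when neither file can be opened, which callers treat as "no cache yet".
pub fn open_for_reading(cache_dir: &Path, cache_file_name: &str, use_json: bool) -> Option<(File, PathBuf, CacheFormat)> {
    let bin_path = cache_dir.join(cache_file_name);
    if let Ok(file) = File::open(&bin_path) {
        return Some((file, bin_path, CacheFormat::Binary));
    }
    if use_json {
        let json_path = cache_dir.join(json_sibling(cache_file_name));
        if let Ok(file) = File::open(&json_path) {
            return Some((file, json_path, CacheFormat::Json));
        }
    }
    None
}
```

One consequence of the string replacement: for a name without `.bin`, such as the `cache_duplicates_*.txt` files, the derived JSON name equals the original name. In this diff that appears harmless because the duplicate finder calls `open_cache_folder` with `use_json` set to `false`.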

View file

@ -1,7 +1,7 @@
use std::collections::BTreeMap;
#[cfg(target_family = "unix")]
use std::collections::HashSet;
use std::fs::{File, OpenOptions};
use std::fs::File;
use std::hash::Hasher;
use std::io::prelude::*;
use std::io::{self, Error, ErrorKind};
@ -16,11 +16,10 @@ use std::time::{Duration, SystemTime};
use std::{fs, mem, thread};
use crossbeam_channel::Receiver;
use directories_next::ProjectDirs;
use humansize::{file_size_opts as options, FileSize};
use rayon::prelude::*;
use crate::common::Common;
use crate::common::{open_cache_folder, Common};
use crate::common_dir_traversal::{CheckingMethod, DirTraversalBuilder, DirTraversalResult, FileEntry, ProgressData};
use crate::common_directory::Directories;
use crate::common_extensions::Extensions;
@ -1262,29 +1261,11 @@ pub fn make_hard_link(src: &Path, dst: &Path) -> io::Result<()> {
}
pub fn save_hashes_to_file(hashmap: &BTreeMap<String, FileEntry>, text_messages: &mut Messages, type_of_hash: &HashType, is_prehash: bool, minimal_cache_file_size: u64) {
if let Some(proj_dirs) = ProjectDirs::from("pl", "Qarmin", "Czkawka") {
let cache_dir = PathBuf::from(proj_dirs.cache_dir());
if cache_dir.exists() {
if !cache_dir.is_dir() {
text_messages.messages.push(format!("Config dir {} is a file!", cache_dir.display()));
return;
}
} else if let Err(e) = fs::create_dir_all(&cache_dir) {
text_messages.messages.push(format!("Cannot create config dir {}, reason {}", cache_dir.display(), e));
return;
}
let cache_file = cache_dir.join(get_file_hash_name(type_of_hash, is_prehash).as_str());
let file_handler = match OpenOptions::new().truncate(true).write(true).create(true).open(&cache_file) {
Ok(t) => t,
Err(e) => {
text_messages
.messages
.push(format!("Cannot create or open cache file {}, reason {}", cache_file.display(), e));
return;
}
};
let mut writer = BufWriter::new(file_handler);
if let Some(((file_handler, cache_file), (_json_file, _json_name))) = open_cache_folder(&get_file_hash_name(type_of_hash, is_prehash), true, false, &mut text_messages.warnings)
{
let mut writer = BufWriter::new(file_handler.unwrap()); // Unwrap cannot fail
let mut how_much = 0;
for file_entry in hashmap.values() {
// Only cache files whose size is at least minimal_cache_file_size
if file_entry.size >= minimal_cache_file_size {
@ -1292,60 +1273,27 @@ pub fn save_hashes_to_file(hashmap: &BTreeMap<String, FileEntry>, text_messages:
if let Err(e) = writeln!(writer, "{}", string) {
text_messages
.messages
.warnings
.push(format!("Failed to save some data to cache file {}, reason {}", cache_file.display(), e));
return;
};
} else {
how_much += 1;
}
}
}
text_messages.messages.push(format!("Properly saved to file {} cache entries.", how_much));
}
}
pub trait MyHasher {
fn update(&mut self, bytes: &[u8]);
fn finalize(&self) -> String;
}
fn hash_calculation(buffer: &mut [u8], file_entry: &FileEntry, hash_type: &HashType, limit: u64) -> Result<String, String> {
let mut file_handler = match File::open(&file_entry.path) {
Ok(t) => t,
Err(e) => return Err(format!("Unable to check hash of file {}, reason {}", file_entry.path.display(), e)),
};
let hasher = &mut *hash_type.hasher();
let mut current_file_read_bytes: u64 = 0;
loop {
let n = match file_handler.read(buffer) {
Ok(0) => break,
Ok(t) => t,
Err(e) => return Err(format!("Error happened when checking hash of file {}, reason {}", file_entry.path.display(), e)),
};
current_file_read_bytes += n as u64;
hasher.update(&buffer[..n]);
if current_file_read_bytes >= limit {
break;
}
}
Ok(hasher.finalize())
}
fn get_file_hash_name(type_of_hash: &HashType, is_prehash: bool) -> String {
let prehash_str = if is_prehash { "_prehash" } else { "" };
format!("cache_duplicates_{:?}{}.txt", type_of_hash, prehash_str)
}
pub fn load_hashes_from_file(text_messages: &mut Messages, delete_outdated_cache: bool, type_of_hash: &HashType, is_prehash: bool) -> Option<BTreeMap<u64, Vec<FileEntry>>> {
if let Some(proj_dirs) = ProjectDirs::from("pl", "Qarmin", "Czkawka") {
let cache_dir = PathBuf::from(proj_dirs.cache_dir());
let cache_file = cache_dir.join(get_file_hash_name(type_of_hash, is_prehash).as_str());
let file_handler = match OpenOptions::new().read(true).open(&cache_file) {
Ok(t) => t,
Err(_inspected) => {
return None;
}
if let Some(((file_handler, cache_file), (_json_file, _json_name))) =
open_cache_folder(&get_file_hash_name(type_of_hash, is_prehash), false, false, &mut text_messages.warnings)
{
// Unwrapping could fail if the cache file failed to open while the JSON file exists
let file_handler = match file_handler {
Some(t) => t,
_ => return Default::default(),
};
let reader = BufReader::new(file_handler);
let mut hashmap_loaded_entries: BTreeMap<u64, Vec<FileEntry>> = Default::default();
@ -1409,13 +1357,47 @@ pub fn load_hashes_from_file(text_messages: &mut Messages, delete_outdated_cache
}
}
text_messages.messages.push(format!("Properly loaded {} cache entries.", hashmap_loaded_entries.len()));
return Some(hashmap_loaded_entries);
}
text_messages.messages.push("Cannot find or open system config dir to save cache file".to_string());
None
}
pub trait MyHasher {
fn update(&mut self, bytes: &[u8]);
fn finalize(&self) -> String;
}
fn hash_calculation(buffer: &mut [u8], file_entry: &FileEntry, hash_type: &HashType, limit: u64) -> Result<String, String> {
let mut file_handler = match File::open(&file_entry.path) {
Ok(t) => t,
Err(e) => return Err(format!("Unable to check hash of file {}, reason {}", file_entry.path.display(), e)),
};
let hasher = &mut *hash_type.hasher();
let mut current_file_read_bytes: u64 = 0;
loop {
let n = match file_handler.read(buffer) {
Ok(0) => break,
Ok(t) => t,
Err(e) => return Err(format!("Error happened when checking hash of file {}, reason {}", file_entry.path.display(), e)),
};
current_file_read_bytes += n as u64;
hasher.update(&buffer[..n]);
if current_file_read_bytes >= limit {
break;
}
}
Ok(hasher.finalize())
}
fn get_file_hash_name(type_of_hash: &HashType, is_prehash: bool) -> String {
let prehash_str = if is_prehash { "_prehash" } else { "" };
format!("cache_duplicates_{:?}{}.txt", type_of_hash, prehash_str)
}
impl MyHasher for blake3::Hasher {
fn update(&mut self, bytes: &[u8]) {
self.update(bytes);
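
The `hash_calculation` function moved in this hunk reads a file through a reusable buffer and stops once `limit` bytes have been consumed, which is how a prehash over only the start of a file can share code with a full hash. The sketch below shows the same loop over any `Read` source. It is an illustration under assumptions: `hash_prefix` is a hypothetical name, and the standard library's `DefaultHasher` stands in for the hashers selected through `HashType`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::io::Read;

/// Hashes roughly the first `limit` bytes of `reader`, reading through `buffer`.
/// The loop stops at end of input or once the running byte count reaches `limit`,
/// so the hashed prefix may overshoot the limit by less than one buffer length.
pub fn hash_prefix<R: Read>(reader: &mut R, buffer: &mut [u8], limit: u64) -> std::io::Result<u64> {
    let mut hasher = DefaultHasher::new(); // stand-in for the hashers behind `HashType`
    let mut read_bytes: u64 = 0;
    loop {
        let n = match reader.read(buffer)? {
            0 => break, // end of input
            n => n,
        };
        read_bytes += n as u64;
        hasher.write(&buffer[..n]);
        if read_bytes >= limit {
            break;
        }
    }
    Ok(hasher.finish())
}
```

Passing `u64::MAX` as the limit hashes the whole input, while a small limit makes two inputs that share a prefix hash identically.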

View file

@ -1,3 +1,6 @@
#![allow(clippy::collapsible_else_if)]
#![allow(clippy::type_complexity)]
#[macro_use]
extern crate bitflags;

View file

@ -1,8 +1,8 @@
use std::collections::BTreeMap;
use std::collections::{BTreeMap, HashMap};
use std::fs::File;
use std::io::prelude::*;
use std::io::BufWriter;
use std::path::PathBuf;
use std::io::{BufReader, BufWriter};
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::sleep;
@ -12,8 +12,9 @@ use std::{mem, thread};
use audiotags::Tag;
use crossbeam_channel::Receiver;
use rayon::prelude::*;
use serde::{Deserialize, Serialize};
use crate::common::Common;
use crate::common::{open_cache_folder, Common};
use crate::common_dir_traversal::{CheckingMethod, DirTraversalBuilder, DirTraversalResult, FileEntry, ProgressData};
use crate::common_directory::Directories;
use crate::common_extensions::Extensions;
@ -43,7 +44,7 @@ bitflags! {
}
}
#[derive(Clone, Debug)]
#[derive(Clone, Debug, Deserialize, Serialize)]
pub struct MusicEntry {
pub size: u64,
@ -61,10 +62,10 @@ pub struct MusicEntry {
}
impl FileEntry {
fn into_music_entry(self) -> MusicEntry {
fn to_music_entry(&self) -> MusicEntry {
MusicEntry {
size: self.size,
path: self.path,
path: self.path.clone(),
modified_date: self.modified_date,
title: "".to_string(),
@ -94,7 +95,7 @@ impl Info {
pub struct SameMusic {
text_messages: Messages,
information: Info,
music_to_check: Vec<FileEntry>,
music_to_check: HashMap<String, MusicEntry>,
music_entries: Vec<MusicEntry>,
duplicated_music_entries: Vec<Vec<MusicEntry>>,
duplicated_music_entries_referenced: Vec<(MusicEntry, Vec<MusicEntry>)>,
@ -108,7 +109,10 @@ pub struct SameMusic {
music_similarity: MusicSimilarity,
stopped_search: bool,
approximate_comparison: bool,
use_cache: bool,
delete_outdated_cache: bool, // TODO add this to GUI
use_reference_folders: bool,
save_also_as_json: bool,
}
impl SameMusic {
@ -127,10 +131,13 @@ impl SameMusic {
minimal_file_size: 8192,
maximal_file_size: u64::MAX,
duplicated_music_entries: vec![],
music_to_check: Vec::with_capacity(2048),
music_to_check: Default::default(),
approximate_comparison: true,
use_cache: true,
delete_outdated_cache: true,
use_reference_folders: false,
duplicated_music_entries_referenced: vec![],
save_also_as_json: false,
}
}
@ -176,6 +183,14 @@ impl SameMusic {
self.delete_method = delete_method;
}
pub fn set_save_also_as_json(&mut self, save_also_as_json: bool) {
self.save_also_as_json = save_also_as_json;
}
pub fn set_use_cache(&mut self, use_cache: bool) {
self.use_cache = use_cache;
}
pub fn set_approximate_comparison(&mut self, approximate_comparison: bool) {
self.approximate_comparison = approximate_comparison;
}
@ -257,7 +272,9 @@ impl SameMusic {
warnings,
} => {
if let Some(music_to_check) = grouped_file_entries.get(&()) {
self.music_to_check = music_to_check.clone();
for fe in music_to_check {
self.music_to_check.insert(fe.path.to_string_lossy().to_string(), fe.to_music_entry());
}
}
self.text_messages.warnings.extend(warnings);
Common::print_time(start_time, SystemTime::now(), "check_files".to_string());
@ -273,6 +290,35 @@ impl SameMusic {
fn check_records_multithreaded(&mut self, stop_receiver: Option<&Receiver<()>>, progress_sender: Option<&futures::channel::mpsc::UnboundedSender<ProgressData>>) -> bool {
let start_time: SystemTime = SystemTime::now();
let loaded_hash_map;
let mut records_already_cached: HashMap<String, MusicEntry> = Default::default();
let mut non_cached_files_to_check: HashMap<String, MusicEntry> = Default::default();
if self.use_cache {
loaded_hash_map = match load_cache_from_file(&mut self.text_messages, self.delete_outdated_cache) {
Some(t) => t,
None => Default::default(),
};
for (name, file_entry) in &self.music_to_check {
#[allow(clippy::if_same_then_else)]
if !loaded_hash_map.contains_key(name) {
// If the loaded data doesn't contain info about the current file
non_cached_files_to_check.insert(name.clone(), file_entry.clone());
} else if file_entry.size != loaded_hash_map.get(name).unwrap().size || file_entry.modified_date != loaded_hash_map.get(name).unwrap().modified_date {
// When the size or modification date of the file changed, it is clearly a different file
non_cached_files_to_check.insert(name.clone(), file_entry.clone());
} else {
// Checking may be skipped when an entry with the same size and modification date already exists
records_already_cached.insert(name.clone(), loaded_hash_map.get(name).unwrap().clone());
}
}
} else {
loaded_hash_map = Default::default();
mem::swap(&mut self.music_to_check, &mut non_cached_files_to_check);
}
let check_was_breaked = AtomicBool::new(false); // Used for breaking from GUI and ending check thread
//// PROGRESS THREAD START
@ -285,7 +331,7 @@ impl SameMusic {
let progress_send = progress_sender.clone();
let progress_thread_run = progress_thread_run.clone();
let atomic_file_counter = atomic_file_counter.clone();
let music_to_check = self.music_to_check.len();
let music_to_check = non_cached_files_to_check.len();
thread::spawn(move || loop {
progress_send
.unbounded_send(ProgressData {
@ -307,46 +353,43 @@ impl SameMusic {
//// PROGRESS THREAD END
// Clean for duplicate files
let music_to_check = mem::take(&mut self.music_to_check);
let vec_file_entry = music_to_check
let mut vec_file_entry = non_cached_files_to_check
.into_par_iter()
.map(|file_entry| {
.map(|(path, mut music_entry)| {
atomic_file_counter.fetch_add(1, Ordering::Relaxed);
if stop_receiver.is_some() && stop_receiver.unwrap().try_recv().is_ok() {
check_was_breaked.store(true, Ordering::Relaxed);
return None;
}
let mut file_entry = file_entry.into_music_entry();
let tag = match Tag::new().read_from_path(&file_entry.path) {
let tag = match Tag::new().read_from_path(&path) {
Ok(t) => t,
Err(_inspected) => return Some(None), // Data not in utf-8, etc., TODO this should be probably added to warnings, errors
};
file_entry.title = match tag.title() {
music_entry.title = match tag.title() {
Some(t) => t.to_string(),
None => "".to_string(),
};
file_entry.artist = match tag.artist() {
music_entry.artist = match tag.artist() {
Some(t) => t.to_string(),
None => "".to_string(),
};
file_entry.album_title = match tag.album_title() {
music_entry.album_title = match tag.album_title() {
Some(t) => t.to_string(),
None => "".to_string(),
};
file_entry.album_artist = match tag.album_artist() {
music_entry.album_artist = match tag.album_artist() {
Some(t) => t.to_string(),
None => "".to_string(),
};
file_entry.year = tag.year().unwrap_or(0);
music_entry.year = tag.year().unwrap_or(0);
Some(Some(file_entry))
Some(Some(music_entry))
})
.while_some()
.filter(|file_entry| file_entry.is_some())
.map(|file_entry| file_entry.unwrap())
.filter(|music_entry| music_entry.is_some())
.map(|music_entry| music_entry.unwrap())
.collect::<Vec<_>>();
// End thread which send info to gui
@ -358,8 +401,22 @@ impl SameMusic {
return false;
}
// Adding files to Vector
self.music_entries = vec_file_entry;
// Merge the results loaded from cache with the freshly calculated ones
for (_name, file_entry) in records_already_cached {
vec_file_entry.push(file_entry.clone());
}
self.music_entries = vec_file_entry.clone();
if self.use_cache {
// Save all results to the cache file: entries loaded from the file plus all newly computed ones
let mut all_results: HashMap<String, MusicEntry> = loaded_hash_map;
for file_entry in vec_file_entry {
all_results.insert(file_entry.path.to_string_lossy().to_string(), file_entry);
}
save_cache_to_file(&all_results, &mut self.text_messages, self.save_also_as_json);
}
Common::print_time(start_time, SystemTime::now(), "check_records_multithreaded".to_string());
@ -632,6 +689,76 @@ impl SameMusic {
}
}
fn save_cache_to_file(hashmap: &HashMap<String, MusicEntry>, text_messages: &mut Messages, save_also_as_json: bool) {
if let Some(((file_handler, cache_file), (file_handler_json, cache_file_json))) = open_cache_folder(&get_cache_file(), true, save_also_as_json, &mut text_messages.warnings) {
{
let writer = BufWriter::new(file_handler.unwrap()); // Unwrap because cannot fail here
if let Err(e) = bincode::serialize_into(writer, hashmap) {
text_messages
.warnings
.push(format!("Cannot write data to cache file {}, reason {}", cache_file.display(), e));
return;
}
}
if save_also_as_json {
if let Some(file_handler_json) = file_handler_json {
let writer = BufWriter::new(file_handler_json);
if let Err(e) = serde_json::to_writer(writer, hashmap) {
text_messages
.warnings
.push(format!("Cannot write data to cache file {}, reason {}", cache_file_json.display(), e));
return;
}
}
}
text_messages.messages.push(format!("Properly saved to file {} cache entries.", hashmap.len()));
}
}
fn load_cache_from_file(text_messages: &mut Messages, delete_outdated_cache: bool) -> Option<HashMap<String, MusicEntry>> {
if let Some(((file_handler, cache_file), (file_handler_json, cache_file_json))) = open_cache_folder(&get_cache_file(), false, true, &mut text_messages.warnings) {
let mut hashmap_loaded_entries: HashMap<String, MusicEntry>;
if let Some(file_handler) = file_handler {
let reader = BufReader::new(file_handler);
hashmap_loaded_entries = match bincode::deserialize_from(reader) {
Ok(t) => t,
Err(e) => {
text_messages
.warnings
.push(format!("Failed to load data from cache file {}, reason {}", cache_file.display(), e));
return None;
}
};
} else {
let reader = BufReader::new(file_handler_json.unwrap()); // Unwrap cannot fail, because at least one file must be valid
hashmap_loaded_entries = match serde_json::from_reader(reader) {
Ok(t) => t,
Err(e) => {
text_messages
.warnings
.push(format!("Failed to load data from cache file {}, reason {}", cache_file_json.display(), e));
return None;
}
};
}
// Don't load cache data if the destination file no longer exists
if delete_outdated_cache {
hashmap_loaded_entries.retain(|src_path, _file_entry| Path::new(src_path).exists());
}
text_messages.messages.push(format!("Properly loaded {} cache entries.", hashmap_loaded_entries.len()));
return Some(hashmap_loaded_entries);
}
None
}
fn get_cache_file() -> String {
"cache_same_music.bin".to_string()
}
impl Default for SameMusic {
fn default() -> Self {
Self::new()
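
The cache lookup added to `check_records_multithreaded` treats a cached record as valid only when both the size and the modification date match the file found on disk; everything else has its tags read again. A compact, std-only sketch of that decision follows. `Entry` and `split_by_cache` are hypothetical names, and `Entry` carries only the fields needed for the comparison plus one tag field.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced stand-in for `MusicEntry`.
#[derive(Clone, Debug, PartialEq)]
pub struct Entry {
    pub size: u64,
    pub modified_date: u64,
    pub title: String,
}

/// Splits the files found on disk into (reusable cached records, files to re-read).
/// A cached record is reused only if both size and modification date are unchanged.
pub fn split_by_cache(
    found: &HashMap<String, Entry>,
    cached: &HashMap<String, Entry>,
) -> (HashMap<String, Entry>, HashMap<String, Entry>) {
    let mut reuse = HashMap::new();
    let mut recheck = HashMap::new();
    for (path, fresh) in found {
        match cached.get(path) {
            // Unchanged file: take the cached record, which already holds the tags.
            Some(old) if old.size == fresh.size && old.modified_date == fresh.modified_date => {
                reuse.insert(path.clone(), old.clone());
            }
            // Missing from the cache or changed on disk: tags must be read again.
            _ => {
                recheck.insert(path.clone(), fresh.clone());
            }
        }
    }
    (reuse, recheck)
}
```

Using a single `match` with a guard avoids the repeated `get(name).unwrap()` calls of the original loop, while keeping the same three outcomes (not cached, changed, unchanged).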

View file

@ -1,5 +1,4 @@
use std::collections::{BTreeSet, HashMap, HashSet};
use std::fs::OpenOptions;
use std::fs::{File, Metadata};
use std::io::Write;
use std::io::*;
@ -13,14 +12,13 @@ use std::{fs, mem, thread};
use bk_tree::BKTree;
use crossbeam_channel::Receiver;
use directories_next::ProjectDirs;
use humansize::{file_size_opts as options, FileSize};
use image::GenericImageView;
use img_hash::{FilterType, HashAlg, HasherConfig};
use rayon::prelude::*;
use serde::{Deserialize, Serialize};
use crate::common::{get_dynamic_image_from_raw_image, Common};
use crate::common::{get_dynamic_image_from_raw_image, open_cache_folder, Common};
use crate::common_directory::Directories;
use crate::common_extensions::Extensions;
use crate::common_items::ExcludedItems;
@ -130,6 +128,7 @@ pub struct SimilarImages {
exclude_images_with_same_size: bool,
use_reference_folders: bool,
fast_comparing: bool,
save_also_as_json: bool,
}
/// Info struct with helpful information about results
@ -173,6 +172,7 @@ impl SimilarImages {
exclude_images_with_same_size: false,
use_reference_folders: false,
fast_comparing: false,
save_also_as_json: false,
}
}
@ -204,6 +204,9 @@ impl SimilarImages {
pub fn set_fast_comparing(&mut self, fast_comparing: bool) {
self.fast_comparing = fast_comparing;
}
pub fn set_save_also_as_json(&mut self, save_also_as_json: bool) {
self.save_also_as_json = save_also_as_json;
}
pub fn get_stopped_search(&self) -> bool {
self.stopped_search
@ -644,7 +647,14 @@ impl SimilarImages {
for (file_entry, _hash) in vec_file_entry {
all_results.insert(file_entry.path.to_string_lossy().to_string(), file_entry);
}
save_hashes_to_file(&all_results, &mut self.text_messages, self.hash_size, self.hash_alg, self.image_filter);
save_hashes_to_file(
&all_results,
&mut self.text_messages,
self.save_also_as_json,
self.hash_size,
self.hash_alg,
self.image_filter,
);
}
Common::print_time(hash_map_modification, SystemTime::now(), "sort_images - saving data to files".to_string());
@ -1033,44 +1043,36 @@ impl PrintResults for SimilarImages {
}
}
pub fn save_hashes_to_file(hashmap: &HashMap<String, FileEntry>, text_messages: &mut Messages, hash_size: u8, hash_alg: HashAlg, image_filter: FilterType) {
if let Some(proj_dirs) = ProjectDirs::from("pl", "Qarmin", "Czkawka") {
let cache_dir = PathBuf::from(proj_dirs.cache_dir());
if cache_dir.exists() {
if !cache_dir.is_dir() {
text_messages.messages.push(format!("Config dir {} is a file!", cache_dir.display()));
return;
}
} else if let Err(e) = fs::create_dir_all(&cache_dir) {
text_messages.messages.push(format!("Cannot create config dir {}, reason {}", cache_dir.display(), e));
return;
}
let cache_file = cache_dir.join(cache_dir.join(get_cache_file(&hash_size, &hash_alg, &image_filter)));
let file_handler = match OpenOptions::new().truncate(true).write(true).create(true).open(&cache_file) {
Ok(t) => t,
Err(e) => {
pub fn save_hashes_to_file(
hashmap: &HashMap<String, FileEntry>,
text_messages: &mut Messages,
save_also_as_json: bool,
hash_size: u8,
hash_alg: HashAlg,
image_filter: FilterType,
) {
if let Some(((file_handler, cache_file), (file_handler_json, cache_file_json))) =
open_cache_folder(&get_cache_file(&hash_size, &hash_alg, &image_filter), true, save_also_as_json, &mut text_messages.warnings)
{
{
let writer = BufWriter::new(file_handler.unwrap()); // Unwrap because cannot fail here
if let Err(e) = bincode::serialize_into(writer, hashmap) {
text_messages
.messages
.push(format!("Cannot create or open cache file {}, reason {}", cache_file.display(), e));
.warnings
.push(format!("Cannot write data to cache file {}, reason {}", cache_file.display(), e));
return;
}
};
let writer = BufWriter::new(file_handler);
#[cfg(not(debug_assertions))]
if let Err(e) = bincode::serialize_into(writer, hashmap) {
text_messages
.messages
.push(format!("Cannot write data to cache file {}, reason {}", cache_file.display(), e));
return;
}
#[cfg(debug_assertions)]
if let Err(e) = serde_json::to_writer(writer, hashmap) {
text_messages
.messages
.push(format!("Cannot write data to cache file {}, reason {}", cache_file.display(), e));
return;
if save_also_as_json {
if let Some(file_handler_json) = file_handler_json {
let writer = BufWriter::new(file_handler_json);
if let Err(e) = serde_json::to_writer(writer, hashmap) {
text_messages
.warnings
.push(format!("Cannot write data to cache file {}, reason {}", cache_file_json.display(), e));
return;
}
}
}
text_messages.messages.push(format!("Properly saved to file {} cache entries.", hashmap.len()));
@ -1084,38 +1086,33 @@ pub fn load_hashes_from_file(
hash_alg: HashAlg,
image_filter: FilterType,
) -> Option<HashMap<String, FileEntry>> {
if let Some(proj_dirs) = ProjectDirs::from("pl", "Qarmin", "Czkawka") {
let cache_dir = PathBuf::from(proj_dirs.cache_dir());
let cache_file = cache_dir.join(get_cache_file(&hash_size, &hash_alg, &image_filter));
let file_handler = match OpenOptions::new().read(true).open(&cache_file) {
Ok(t) => t,
Err(_inspected) => {
// text_messages.messages.push(format!("Cannot find or open cache file {}", cache_file.display())); // No error warning
return None;
}
};
let reader = BufReader::new(file_handler);
#[cfg(debug_assertions)]
let mut hashmap_loaded_entries: HashMap<String, FileEntry> = match serde_json::from_reader(reader) {
Ok(t) => t,
Err(e) => {
text_messages
.warnings
.push(format!("Failed to load data from cache file {}, reason {}", cache_file.display(), e));
return None;
}
};
#[cfg(not(debug_assertions))]
let mut hashmap_loaded_entries: HashMap<String, FileEntry> = match bincode::deserialize_from(reader) {
Ok(t) => t,
Err(e) => {
text_messages
.warnings
.push(format!("Failed to load data from cache file {}, reason {}", cache_file.display(), e));
return None;
}
};
if let Some(((file_handler, cache_file), (file_handler_json, cache_file_json))) =
open_cache_folder(&get_cache_file(&hash_size, &hash_alg, &image_filter), false, true, &mut text_messages.warnings)
{
let mut hashmap_loaded_entries: HashMap<String, FileEntry>;
if let Some(file_handler) = file_handler {
let reader = BufReader::new(file_handler);
hashmap_loaded_entries = match bincode::deserialize_from(reader) {
Ok(t) => t,
Err(e) => {
text_messages
.warnings
.push(format!("Failed to load data from cache file {}, reason {}", cache_file.display(), e));
return None;
}
};
} else {
let reader = BufReader::new(file_handler_json.unwrap()); // Unwrap cannot fail, because at least one file must be valid
hashmap_loaded_entries = match serde_json::from_reader(reader) {
Ok(t) => t,
Err(e) => {
text_messages
.warnings
.push(format!("Failed to load data from cache file {}, reason {}", cache_file_json.display(), e));
return None;
}
};
}
// Don't load cache data if the destination file does not exist
if delete_outdated_cache {
@ -1126,28 +1123,15 @@ pub fn load_hashes_from_file(
return Some(hashmap_loaded_entries);
}
text_messages.messages.push("Cannot find or open system config dir to save cache file".to_string());
None
}
fn get_cache_file(hash_size: &u8, hash_alg: &HashAlg, image_filter: &FilterType) -> String {
let extension;
#[cfg(debug_assertions)]
{
extension = "json";
}
#[cfg(not(debug_assertions))]
{
extension = "bin";
}
format!(
"cache_similar_images_{}_{}_{}.{}",
"cache_similar_images_{}_{}_{}.bin",
hash_size,
convert_algorithm_to_string(hash_alg),
convert_filters_to_string(image_filter),
extension
)
}


@ -1,5 +1,4 @@
use std::collections::{BTreeMap, BTreeSet, HashMap};
use std::fs::OpenOptions;
use std::fs::{File, Metadata};
use std::io::Write;
use std::io::*;
@ -11,7 +10,6 @@ use std::time::{Duration, SystemTime, UNIX_EPOCH};
use std::{fs, mem, thread};
use crossbeam_channel::Receiver;
use directories_next::ProjectDirs;
use ffmpeg_cmdline_utils::FfmpegErrorKind::FfmpegNotFound;
use humansize::{file_size_opts as options, FileSize};
use rayon::prelude::*;
@ -19,7 +17,7 @@ use serde::{Deserialize, Serialize};
use vid_dup_finder_lib::HashCreationErrorKind::DetermineVideo;
use vid_dup_finder_lib::{NormalizedTolerance, VideoHash};
use crate::common::Common;
use crate::common::{open_cache_folder, Common};
use crate::common_directory::Directories;
use crate::common_extensions::Extensions;
use crate::common_items::ExcludedItems;
@ -81,6 +79,7 @@ pub struct SimilarVideos {
delete_outdated_cache: bool,
exclude_videos_with_same_size: bool,
use_reference_folders: bool,
save_also_as_json: bool,
}
/// Info struct with helpful information about results
@ -119,6 +118,7 @@ impl SimilarVideos {
exclude_videos_with_same_size: false,
use_reference_folders: false,
similar_referenced_vectors: vec![],
save_also_as_json: true,
}
}
@ -134,6 +134,9 @@ impl SimilarVideos {
assert!((0..=MAX_TOLERANCE).contains(&tolerance));
self.tolerance = tolerance
}
pub fn set_save_also_as_json(&mut self, save_also_as_json: bool) {
self.save_also_as_json = save_also_as_json;
}
pub fn get_stopped_search(&self) -> bool {
self.stopped_search
@ -529,7 +532,7 @@ impl SimilarVideos {
for file_entry in vec_file_entry {
all_results.insert(file_entry.path.to_string_lossy().to_string(), file_entry);
}
save_hashes_to_file(&all_results, &mut self.text_messages);
save_hashes_to_file(&all_results, &mut self.text_messages, self.save_also_as_json);
}
Common::print_time(hash_map_modification, SystemTime::now(), "sort_videos - saving data to files".to_string());
@ -705,44 +708,27 @@ impl PrintResults for SimilarVideos {
}
}
pub fn save_hashes_to_file(hashmap: &BTreeMap<String, FileEntry>, text_messages: &mut Messages) {
if let Some(proj_dirs) = ProjectDirs::from("pl", "Qarmin", "Czkawka") {
let cache_dir = PathBuf::from(proj_dirs.cache_dir());
if cache_dir.exists() {
if !cache_dir.is_dir() {
text_messages.messages.push(format!("Config dir {} is a file!", cache_dir.display()));
return;
}
} else if let Err(e) = fs::create_dir_all(&cache_dir) {
text_messages.messages.push(format!("Cannot create config dir {}, reason {}", cache_dir.display(), e));
return;
}
let cache_file = cache_dir.join(cache_dir.join(get_cache_file()));
let file_handler = match OpenOptions::new().truncate(true).write(true).create(true).open(&cache_file) {
Ok(t) => t,
Err(e) => {
pub fn save_hashes_to_file(hashmap: &BTreeMap<String, FileEntry>, text_messages: &mut Messages, save_also_as_json: bool) {
if let Some(((file_handler, cache_file), (file_handler_json, cache_file_json))) = open_cache_folder(&get_cache_file(), true, save_also_as_json, &mut text_messages.warnings) {
{
let writer = BufWriter::new(file_handler.unwrap()); // Unwrap because cannot fail here
if let Err(e) = bincode::serialize_into(writer, hashmap) {
text_messages
.messages
.push(format!("Cannot create or open cache file {}, reason {}", cache_file.display(), e));
.warnings
.push(format!("Cannot write data to cache file {}, reason {}", cache_file.display(), e));
return;
}
};
let writer = BufWriter::new(file_handler);
#[cfg(not(debug_assertions))]
if let Err(e) = bincode::serialize_into(writer, hashmap) {
text_messages
.messages
.push(format!("Cannot write data to cache file {}, reason {}", cache_file.display(), e));
return;
}
#[cfg(debug_assertions)]
if let Err(e) = serde_json::to_writer(writer, hashmap) {
text_messages
.messages
.push(format!("Cannot write data to cache file {}, reason {}", cache_file.display(), e));
return;
if save_also_as_json {
if let Some(file_handler_json) = file_handler_json {
let writer = BufWriter::new(file_handler_json);
if let Err(e) = serde_json::to_writer(writer, hashmap) {
text_messages
.warnings
.push(format!("Cannot write data to cache file {}, reason {}", cache_file_json.display(), e));
return;
}
}
}
text_messages.messages.push(format!("Properly saved to file {} cache entries.", hashmap.len()));
@ -750,38 +736,31 @@ pub fn save_hashes_to_file(hashmap: &BTreeMap<String, FileEntry>, text_messages:
}
pub fn load_hashes_from_file(text_messages: &mut Messages, delete_outdated_cache: bool) -> Option<BTreeMap<String, FileEntry>> {
if let Some(proj_dirs) = ProjectDirs::from("pl", "Qarmin", "Czkawka") {
let cache_dir = PathBuf::from(proj_dirs.cache_dir());
let cache_file = cache_dir.join(get_cache_file());
let file_handler = match OpenOptions::new().read(true).open(&cache_file) {
Ok(t) => t,
Err(_inspected) => {
// text_messages.messages.push(format!("Cannot find or open cache file {}", cache_file.display())); // No error warning
return None;
}
};
let reader = BufReader::new(file_handler);
#[cfg(debug_assertions)]
let mut hashmap_loaded_entries: BTreeMap<String, FileEntry> = match serde_json::from_reader(reader) {
Ok(t) => t,
Err(e) => {
text_messages
.warnings
.push(format!("Failed to load data from cache file {}, reason {}", cache_file.display(), e));
return None;
}
};
#[cfg(not(debug_assertions))]
let mut hashmap_loaded_entries: BTreeMap<String, FileEntry> = match bincode::deserialize_from(reader) {
Ok(t) => t,
Err(e) => {
text_messages
.warnings
.push(format!("Failed to load data from cache file {}, reason {}", cache_file.display(), e));
return None;
}
};
if let Some(((file_handler, cache_file), (file_handler_json, cache_file_json))) = open_cache_folder(&get_cache_file(), false, true, &mut text_messages.warnings) {
let mut hashmap_loaded_entries: BTreeMap<String, FileEntry>;
if let Some(file_handler) = file_handler {
let reader = BufReader::new(file_handler);
hashmap_loaded_entries = match bincode::deserialize_from(reader) {
Ok(t) => t,
Err(e) => {
text_messages
.warnings
.push(format!("Failed to load data from cache file {}, reason {}", cache_file.display(), e));
return None;
}
};
} else {
let reader = BufReader::new(file_handler_json.unwrap()); // Unwrap cannot fail, because at least one file must be valid
hashmap_loaded_entries = match serde_json::from_reader(reader) {
Ok(t) => t,
Err(e) => {
text_messages
.warnings
.push(format!("Failed to load data from cache file {}, reason {}", cache_file_json.display(), e));
return None;
}
};
}
// Don't load cache data if the destination file does not exist
if delete_outdated_cache {
@ -792,23 +771,11 @@ pub fn load_hashes_from_file(text_messages: &mut Messages, delete_outdated_cache
return Some(hashmap_loaded_entries);
}
text_messages.messages.push("Cannot find or open system config dir to save cache file.".to_string());
None
}
fn get_cache_file() -> String {
let extension;
#[cfg(debug_assertions)]
{
extension = "json";
}
#[cfg(not(debug_assertions))]
{
extension = "bin";
}
format!("cache_similar_videos.{}", extension)
"cache_similar_videos.bin".to_string()
}
pub fn check_if_ffmpeg_is_installed() -> bool {


@ -103,6 +103,7 @@ pub fn connect_button_search(
let button_app_info = gui_data.header.button_app_info.clone();
let check_button_music_approximate_comparison = gui_data.main_notebook.check_button_music_approximate_comparison.clone();
let check_button_image_fast_compare = gui_data.main_notebook.check_button_image_fast_compare.clone();
let check_button_settings_save_also_json = gui_data.settings.check_button_settings_save_also_json.clone();
buttons_search_clone.connect_clicked(move |_| {
let included_directories = get_path_buf_from_vector_of_strings(get_string_from_list_store(&tree_view_included_directories, ColumnsIncludedDirectory::Path as i32, None));
@ -117,6 +118,7 @@ pub fn connect_button_search(
let allowed_extensions = entry_allowed_extensions.text().as_str().to_string();
let hide_hard_links = check_button_settings_hide_hard_links.is_active();
let use_cache = check_button_settings_use_cache.is_active();
let save_also_as_json = check_button_settings_save_also_json.is_active();
let minimal_cache_file_size = entry_settings_cache_file_minimal_size.text().as_str().parse::<u64>().unwrap_or(1024 * 1024 / 4);
let minimal_file_size = entry_general_minimal_size.text().as_str().parse::<u64>().unwrap_or(1024 * 8);
@ -317,6 +319,7 @@ pub fn connect_button_search(
sf.set_delete_outdated_cache(delete_outdated_cache);
sf.set_exclude_images_with_same_size(ignore_same_size);
sf.set_fast_comparing(fast_compare);
sf.set_save_also_as_json(save_also_as_json);
sf.find_similar_images(Some(&stop_receiver), Some(&futures_sender_similar_images));
let _ = glib_stop_sender.send(Message::SimilarImages(sf));
});
@ -351,6 +354,7 @@ pub fn connect_button_search(
sf.set_tolerance(tolerance);
sf.set_delete_outdated_cache(delete_outdated_cache);
sf.set_exclude_videos_with_same_size(ignore_same_size);
sf.set_save_also_as_json(save_also_as_json);
sf.find_similar_videos(Some(&stop_receiver), Some(&futures_sender_similar_videos));
let _ = glib_stop_sender.send(Message::SimilarVideos(sf));
});
@ -450,6 +454,7 @@ pub fn connect_button_search(
br.set_excluded_items(excluded_items);
br.set_use_cache(use_cache);
br.set_allowed_extensions(allowed_extensions);
br.set_save_also_as_json(save_also_as_json);
br.find_broken_files(Some(&stop_receiver), Some(&futures_sender_broken_files));
let _ = glib_stop_sender.send(Message::BrokenFiles(br));
});


@ -154,7 +154,7 @@ pub fn connect_settings(gui_data: &GuiData) {
{
for hash_alg in [HashAlg::Blockhash, HashAlg::Gradient, HashAlg::DoubleGradient, HashAlg::VertGradient, HashAlg::Mean].iter() {
if let Some(cache_entries) = czkawka_core::similar_images::load_hashes_from_file(&mut messages, true, *hash_size, *hash_alg, *image_filter) {
czkawka_core::similar_images::save_hashes_to_file(&cache_entries, &mut messages, *hash_size, *hash_alg, *image_filter);
czkawka_core::similar_images::save_hashes_to_file(&cache_entries, &mut messages, false, *hash_size, *hash_alg, *image_filter);
}
}
}
@ -182,7 +182,7 @@ pub fn connect_settings(gui_data: &GuiData) {
if response_type == ResponseType::Ok {
let mut messages: Messages = Messages::new();
if let Some(cache_entries) = czkawka_core::similar_videos::load_hashes_from_file(&mut messages, true) {
czkawka_core::similar_videos::save_hashes_to_file(&cache_entries, &mut messages);
czkawka_core::similar_videos::save_hashes_to_file(&cache_entries, &mut messages, false);
}
messages.messages.push(fl!("cache_properly_cleared"));


@ -17,6 +17,7 @@ pub struct GuiSettings {
pub check_button_settings_confirm_group_deletion: gtk::CheckButton,
pub check_button_settings_show_text_view: gtk::CheckButton,
pub check_button_settings_use_cache: gtk::CheckButton,
pub check_button_settings_save_also_json: gtk::CheckButton,
pub check_button_settings_use_trash: gtk::CheckButton,
pub label_settings_general_language: gtk::Label,
pub combo_box_settings_language: gtk::ComboBoxText,
@ -70,6 +71,7 @@ impl GuiSettings {
let check_button_settings_confirm_group_deletion: gtk::CheckButton = builder.object("check_button_settings_confirm_group_deletion").unwrap();
let check_button_settings_show_text_view: gtk::CheckButton = builder.object("check_button_settings_show_text_view").unwrap();
let check_button_settings_use_cache: gtk::CheckButton = builder.object("check_button_settings_use_cache").unwrap();
let check_button_settings_save_also_json: gtk::CheckButton = builder.object("check_button_settings_save_also_json").unwrap();
let check_button_settings_use_trash: gtk::CheckButton = builder.object("check_button_settings_use_trash").unwrap();
let label_settings_general_language: gtk::Label = builder.object("label_settings_general_language").unwrap();
let combo_box_settings_language: gtk::ComboBoxText = builder.object("combo_box_settings_language").unwrap();
@ -112,6 +114,7 @@ impl GuiSettings {
check_button_settings_confirm_group_deletion,
check_button_settings_show_text_view,
check_button_settings_use_cache,
check_button_settings_save_also_json,
check_button_settings_use_trash,
label_settings_general_language,
combo_box_settings_language,
@ -147,6 +150,7 @@ impl GuiSettings {
self.check_button_settings_confirm_group_deletion.set_label(&fl!("settings_confirm_group_deletion_button"));
self.check_button_settings_show_text_view.set_label(&fl!("settings_show_text_view_button"));
self.check_button_settings_use_cache.set_label(&fl!("settings_use_cache_button"));
self.check_button_settings_save_also_json.set_label(&fl!("settings_save_also_as_json_button"));
self.check_button_settings_use_trash.set_label(&fl!("settings_use_trash_button"));
self.label_settings_general_language.set_label(&fl!("settings_language_label"));
@ -160,6 +164,8 @@ impl GuiSettings {
.set_tooltip_text(Some(&fl!("settings_confirm_group_deletion_button_tooltip")));
self.check_button_settings_show_text_view
.set_tooltip_text(Some(&fl!("settings_show_text_view_button_tooltip")));
self.check_button_settings_save_also_json
.set_tooltip_text(Some(&fl!("settings_save_also_as_json_button_tooltip")));
self.check_button_settings_use_cache.set_tooltip_text(Some(&fl!("settings_use_cache_button_tooltip")));
self.check_button_settings_use_trash.set_tooltip_text(Some(&fl!("settings_use_trash_button_tooltip")));
self.label_settings_general_language.set_tooltip_text(Some(&fl!("settings_language_label_tooltip")));


@ -10,6 +10,10 @@ pub const LANGUAGES_ALL: [Language; 12] = [
combo_box_text: "English",
short_text: "en",
},
Language {
combo_box_text: "Français (French)",
short_text: "fr",
},
Language {
combo_box_text: "Italiano (Italian)",
short_text: "it",
@ -26,10 +30,6 @@ pub const LANGUAGES_ALL: [Language; 12] = [
combo_box_text: "Deutsch (German) - Computer translation",
short_text: "de",
},
Language {
combo_box_text: "Français (French) - Computer translation",
short_text: "fr",
},
Language {
combo_box_text: "やまと (Japanese) - Computer translation",
short_text: "ja",


@ -27,6 +27,7 @@ const DEFAULT_SHOW_IMAGE_PREVIEW: bool = true;
const DEFAULT_SHOW_DUPLICATE_IMAGE_PREVIEW: bool = true;
const DEFAULT_BOTTOM_TEXT_VIEW: bool = true;
const DEFAULT_USE_CACHE: bool = true;
const DEFAULT_SAVE_ALSO_AS_JSON: bool = false;
const DEFAULT_HIDE_HARD_LINKS: bool = true;
const DEFAULT_USE_PRECACHE: bool = false;
const DEFAULT_USE_TRASH: bool = false;
@ -363,6 +364,7 @@ enum LoadText {
ShowBottomTextPanel,
HideHardLinks,
UseCache,
UseJsonCacheFile,
DeleteToTrash,
MinimalCacheSize,
ImagePreviewImage,
@ -388,6 +390,7 @@ fn create_hash_map() -> (HashMap<LoadText, String>, HashMap<String, LoadText>) {
(LoadText::ShowBottomTextPanel, "show_bottom_text_panel"),
(LoadText::HideHardLinks, "hide_hard_links"),
(LoadText::UseCache, "use_cache"),
(LoadText::UseJsonCacheFile, "use_json_cache_file"),
(LoadText::DeleteToTrash, "delete_to_trash"),
(LoadText::MinimalCacheSize, "minimal_cache_size"),
(LoadText::ImagePreviewImage, "image_preview_image"),
@ -475,6 +478,10 @@ pub fn save_configuration(manual_execution: bool, upper_notebook: &GuiUpperNoteb
hashmap_ls.get(&LoadText::UseCache).unwrap().to_string(),
settings.check_button_settings_use_cache.is_active(),
);
saving_struct.save_var(
hashmap_ls.get(&LoadText::UseJsonCacheFile).unwrap().to_string(),
settings.check_button_settings_save_also_json.is_active(),
);
saving_struct.save_var(
hashmap_ls.get(&LoadText::DeleteToTrash).unwrap().to_string(),
settings.check_button_settings_use_trash.is_active(),
@ -546,6 +553,7 @@ pub fn load_configuration(manual_execution: bool, upper_notebook: &GuiUpperNoteb
let bottom_text_panel: bool = loaded_entries.get_bool(hashmap_ls.get(&LoadText::ShowBottomTextPanel).unwrap().clone(), DEFAULT_BOTTOM_TEXT_VIEW);
let hide_hard_links: bool = loaded_entries.get_bool(hashmap_ls.get(&LoadText::HideHardLinks).unwrap().clone(), DEFAULT_HIDE_HARD_LINKS);
let use_cache: bool = loaded_entries.get_bool(hashmap_ls.get(&LoadText::UseCache).unwrap().clone(), DEFAULT_USE_CACHE);
let use_json_cache: bool = loaded_entries.get_bool(hashmap_ls.get(&LoadText::UseJsonCacheFile).unwrap().clone(), DEFAULT_SAVE_ALSO_AS_JSON);
let use_trash: bool = loaded_entries.get_bool(hashmap_ls.get(&LoadText::DeleteToTrash).unwrap().clone(), DEFAULT_USE_TRASH);
let delete_outdated_cache_duplicates: bool = loaded_entries.get_bool(
hashmap_ls.get(&LoadText::DuplicateDeleteOutdatedCacheEntries).unwrap().clone(),
@ -630,6 +638,7 @@ pub fn load_configuration(manual_execution: bool, upper_notebook: &GuiUpperNoteb
}
settings.check_button_settings_hide_hard_links.set_active(hide_hard_links);
settings.check_button_settings_use_cache.set_active(use_cache);
settings.check_button_settings_save_also_json.set_active(use_json_cache);
settings.check_button_duplicates_use_prehash_cache.set_active(use_prehash_cache);
settings.check_button_settings_use_trash.set_active(use_trash);
settings.entry_settings_cache_file_minimal_size.set_text(&cache_minimal_size);
@ -711,6 +720,7 @@ pub fn reset_configuration(manual_clearing: bool, upper_notebook: &GuiUpperNoteb
settings.check_button_settings_show_text_view.set_active(DEFAULT_BOTTOM_TEXT_VIEW);
settings.check_button_settings_hide_hard_links.set_active(DEFAULT_HIDE_HARD_LINKS);
settings.check_button_settings_use_cache.set_active(DEFAULT_USE_CACHE);
settings.check_button_settings_save_also_json.set_active(DEFAULT_SAVE_ALSO_AS_JSON);
settings.check_button_settings_use_trash.set_active(DEFAULT_USE_TRASH);
settings.entry_settings_cache_file_minimal_size.set_text(DEFAULT_MINIMAL_CACHE_SIZE);
settings


@ -238,11 +238,26 @@ Author: Rafał Mikrut
<property name="draw-indicator">True</property>
</object>
<packing>
<property name="expand">False</property>
<property name="expand">True</property>
<property name="fill">True</property>
<property name="position">7</property>
</packing>
</child>
<child>
<object class="GtkCheckButton" id="check_button_settings_save_also_json">
<property name="label" translatable="yes">Save cache also to JSON file</property>
<property name="visible">True</property>
<property name="can-focus">True</property>
<property name="receives-default">False</property>
<property name="active">True</property>
<property name="draw-indicator">True</property>
</object>
<packing>
<property name="expand">True</property>
<property name="fill">True</property>
<property name="position">8</property>
</packing>
</child>
<child>
<object class="GtkCheckButton" id="check_button_settings_use_trash">
<property name="label" translatable="yes">Move deleted files to trash</property>
@ -255,7 +270,7 @@ Author: Rafał Mikrut
<packing>
<property name="expand">False</property>
<property name="fill">True</property>
<property name="position">8</property>
<property name="position">9</property>
</packing>
</child>
</object>


@ -274,6 +274,7 @@ settings_confirm_link_button_tooltip = Shows confirmation dialog when clicking a
settings_confirm_group_deletion_button_tooltip = Shows dialog when trying to remove all records from group.
settings_show_text_view_button_tooltip = Shows error panel at bottom.
settings_use_cache_button_tooltip = Option which allows disabling the cache feature.
settings_save_also_as_json_button_tooltip = Save cache in human-readable JSON format. Its content can be modified. The app reads the cache from this file automatically if the binary cache (with the bin extension) is missing.
settings_use_trash_button_tooltip = When enabled it moves files to trash instead deleting them permanently.
settings_language_label_tooltip = Allows to choose language of interface from available ones.
@ -284,6 +285,7 @@ settings_confirm_link_button = Show confirm dialog when hard/symlinks any files
settings_confirm_group_deletion_button = Show confirm dialog when deleting all files in group
settings_show_text_view_button = Show bottom text panel
settings_use_cache_button = Use cache
settings_save_also_as_json_button = Save cache also to JSON file
settings_use_trash_button = Move deleted files to trash
settings_language_label = Language


@ -5,7 +5,7 @@ If you use Snap, Flatpak or Appimage, you need to only install ffmpeg if you wan
For Czkawka GUI the lowest supported version of GTK is `3.24`, which is the only required dependency (on Ubuntu, at least; other distributions will probably require a slightly different set of dependencies).
The app includes a Similar Videos tool which requires `FFmpeg` to work, but it is completely optional: without FFmpeg installed, only a warning is printed when trying to use this tool.
Broken files finder by default don't check for music files, and it is possible to enable this feature but it require to have alsa lib installed(on Ubuntu this is `libasound2-dev` package)
By default the broken files finder does not check music files. This feature can be enabled, but it requires the ALSA library to be installed (on Ubuntu this is the `libasound2-dev` package).
#### Ubuntu/Debian/Linux Mint
```


@ -9,37 +9,23 @@
Czkawka for now contains two independent frontends - the terminal and graphical interface which share the core module.
## GUI GTK
<img src="https://user-images.githubusercontent.com/41945903/103002387-14d1b800-452f-11eb-967e-9d5905dd6db5.png" width="800" />
<img src="https://user-images.githubusercontent.com/41945903/148281103-13c00d08-7881-43e8-b6e3-5178473bce85.png" width="800" />
### GUI overview
The GUI is built from different pieces:
- Red - Program settings, contains info about included/excluded directories which user may want to check. Also, there is a tab with allowed extensions, which allows users to choose which type of files they want to check. Next category is Excluded items, which allows to discard specific path by using `*` wildcard - so `/home/ra*` means that e.g. `/home/rafal/` will be ignored but not `/home/czkawka/`. The last one is settings tab which allows to save configuration of the program, reset and load it when needed.
- Green - This allows to choose which tool we want to use.
- Blue - Here are settings for the current tool, which we want/need to configure
- Pink - Window in which results of searching are printed
- Yellow - Box with buttons like `Search`(starts searching with the currently selected tool), `Hide Text View`(hides text box at the bottom with white overlay), `Symlink`(creates symlink to selected file), `Select`(shows options to select specific rows), `Delete`(deletes selected files), `Save`(save to file the search result) - some buttons are only visible when at least one result is visible.
- Brown - Small informative field to show informations e.g. about number of found duplicate files
- White - Text window to show possible errors/warnings e.g. when failed to delete folder due no permissions etc.
- 1 - Image preview - used in the duplicate files and similar images finders. It cannot be resized, but it can be disabled.
- 2 - Main notebook, used to switch between tools.
- 3 - Main results window - allows choosing, deleting and configuring results.
- 4 - Bottom image panels - contain buttons which perform specific actions on data (like selecting entries) or, e.g., hide/show parts of the GUI.
- 5 - Text panel - prints messages/warnings/errors about executed actions. The user can hide it.
- 6 - Panel for selecting specific directories to include or exclude. Allowed extensions and file sizes are also specified here.
- 7 - Buttons which open the About window (shows info about the app) and Settings, in which the scan can be customized.
There is also an option to see image previews in Similar Images tool.
<img src="https://user-images.githubusercontent.com/41945903/148279809-54ea8684-8bff-436b-af67-ff9859f468f2.png" width="800" />
<img src="https://user-images.githubusercontent.com/41945903/103025544-50ca4480-4552-11eb-9a54-f1b1f6f725b1.png" width="800" />
### Action Buttons
There are several buttons which do different actions:
- Search - starts searching and shows progress dialog
- Stop - button in progress dialog, allows to easily stop current task. Sometimes it may take a few seconds until all atomic operations end and GUI will become responsive again
- Select - allows selecting multiple entries at once
- Delete - deletes entirely all selected entries
- Symlink - creates symlink to selected files(first file is threaten as original and rest will become symlinks)
- Save - save initial state of results
- Hamburger(parallel lines) - used to show/hide bottom text panel which shows warnings/errors
- Add (directories) - adds directories to include or exclude
- Remove (directories) - removes directories to search or to exclude
- Manual Add (directories) - allows to input by typing directories (may be used to enter non visible in file manager directories)
- Save current configuration - saves current GUI configuration to configuration file
- Load configuration - loads configuration of file and overrides current GUI config
- Reset configuration - resets current GUI configuration to defaults
### Translations
GUI is fully translatable.
For now at least 10 languages are supported (some were translated by machine).
### Opening/Manipulating files
It is possible to open selected files by double clicking on them.
@ -67,10 +53,13 @@ By default, all tools only write about results to console, but it is possible wi
## Config/Cache files
Currently, Czkawka stores a few config and cache files on disk:
- `czkawka_gui_config.txt` - stores configuration of GUI which may be loaded at startup
- `cache_similar_image_SIZE_HASH_FILTER.txt` - stores cache data and hashes which may be used later without needing to compute image hash again - editing this file manually is not recommended, but it is allowed. Each algorithms uses its own file, because hashes are completely different in each.
- `cache_similar_images_SIZE_HASH_FILTER.bin/json` - stores cache data and hashes which may be used later without needing to compute the image hash again. Each algorithm uses its own file, because the hashes are completely different in each.
- `cache_broken_files.txt` - stores cache data of broken files
- `cache_duplicates_Blake3.txt` - stores cache data of duplicated files. To avoid too big a performance hit when saving/loading the file, only fully hashed files bigger than 5MB are stored. Similar files with `Blake3` replaced by e.g. `SHA256` may appear when support for new hashes is introduced in Czkawka.
- `cache_similar_videos.bin/json` - stores cache data of video files. Depending on the build (debug/release), it produces a bin file (fast, but not editable) or a json file (slow, but well formatted).
- `cache_duplicates_HASH.txt` - stores cache data of duplicated files. To avoid too big a performance hit when saving/loading the file, only fully hashed files bigger than 5MB are stored. Similar files with `Blake3` replaced by e.g. `SHA256` may appear when support for new hashes is introduced in Czkawka.
- `cache_similar_videos.bin/json` - stores cache data of video files.
Editing `bin` files may cause strange crashes; if any occur, removing these files should help.
It is possible to modify files with the JSON extension (this may be helpful when moving files to a different disk or when using a cache file on a different computer). To do this, enable the settings option that also generates a json cache file; that file can then be edited. By default, cache files with the `bin` extension are loaded, but if the `bin` file is missing (it can be renamed or removed), data is loaded from the json file if it exists.
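
The loading order described above can be sketched as follows. This is only an illustration in Python, not Czkawka's actual code; the function name `pick_cache_file` and its parameters are hypothetical.

```python
from pathlib import Path
from typing import Optional


def pick_cache_file(cache_dir: Path, base_name: str) -> Optional[Path]:
    """Return the cache file to load, preferring `.bin` over `.json`.

    Hypothetical helper: it mirrors the documented rule that the bin file
    is used by default and the json file is only a fallback.
    """
    bin_path = cache_dir / f"{base_name}.bin"
    json_path = cache_dir / f"{base_name}.json"
    if bin_path.is_file():
        return bin_path
    if json_path.is_file():
        return json_path
    # Neither file exists, so there is no cache to load.
    return None
```

With this rule, renaming or removing `cache_similar_videos.bin` is enough to make an edited `cache_similar_videos.json` take effect on the next load.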
Config files are located in this path:
@ -92,7 +81,7 @@ Windows - `C:\Users\Username\AppData\Local\Qarmin\Czkawka\cache`
- **Not all columns are always visible**
Currently, some columns may not be visible when others are too wide. There are 2 workarounds for now
- View can be scrolled via horizontal scroll bar (1 on image)
- Size of other columns can be slimmed (2 )
- Size of other columns can be slimmed (2)
Whether this can be fixed is being checked in https://github.com/qarmin/czkawka/issues/169
![AA](https://user-images.githubusercontent.com/41945903/125684641-728e264a-34ab-41b1-9853-ab45dc25551f.png)
- **Opening parent folders**
@ -226,11 +215,11 @@ Next these hashes are saved to file, to be able to open images without needing t
Finally, each hash is compared with the others and if the distance between them is less than the maximum distance specified by the user, the images are considered similar and removed from the pool of images to be searched.
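
A minimal sketch of this comparison step, assuming the distance is the Hamming distance (number of differing bits) between equal-length hashes and that the user's maximum is inclusive. The names `hamming_distance` and `group_similar` are hypothetical and the real implementation may differ (for example, by using a faster search structure).

```python
from typing import List, Tuple


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length hashes."""
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def group_similar(hashes: List[Tuple[str, bytes]], max_distance: int) -> List[List[str]]:
    """Group images whose hashes are within max_distance of a reference image.

    Assumption: the limit is inclusive. Matched images are removed from the
    pool, so each image ends up in at most one group.
    """
    pool = list(hashes)
    groups = []
    while pool:
        name, reference = pool.pop(0)
        group = [name]
        remaining = []
        for other_name, other_hash in pool:
            if hamming_distance(reference, other_hash) <= max_distance:
                group.append(other_name)
            else:
                remaining.append((other_name, other_hash))
        pool = remaining
        if len(group) > 1:
            groups.append(group)
    return groups
```

This naive version compares every remaining hash with the reference, which is quadratic in the worst case; it is meant to show the idea, not to be fast.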
It is possible to choose one of 5 types of hashes - `Gradient`, `Mean`, `VertGradient`, `Blockhash`, `DoubleGradient`.
Before calculating hashes, images are usually resized with a specific algorithm (`Lanczos3`, `Gaussian`, `CatmullRom`, `Triangle`, `Nearest`) to e.g. an 8x8 or 16x16 image (allowed sizes - `4x4`, `8x8`, `16x16`), which simplifies later computations. Both size and filter can be adjusted in the application.
Before calculating hashes, images are usually resized with a specific algorithm (`Lanczos3`, `Gaussian`, `CatmullRom`, `Triangle`, `Nearest`) to e.g. an 8x8 or 16x16 image (allowed sizes - `8x8`, `16x16`, `32x32`, `64x64`), which simplifies later computations. Both size and filter can be adjusted in the application.
Each configuration saves results to different cache files to save users from invalid results.
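
To illustrate why separate files prevent mixing incompatible hashes, here is a small sketch that builds a file name from the `cache_similar_image_SIZE_HASH_FILTER` pattern. How Czkawka actually spells the size, hash and filter parts is an assumption here, and `cache_file_name` is a hypothetical helper.

```python
def cache_file_name(hash_size: int, hash_alg: str, resize_filter: str,
                    extension: str = "bin") -> str:
    """Build a per-configuration cache file name.

    Assumption: SIZE is written as a single number and HASH/FILTER use the
    names shown in this document; the real format may differ.
    """
    return f"cache_similar_image_{hash_size}_{hash_alg}_{resize_filter}.{extension}"
```

Because every parameter is part of the name, changing any one of them selects a different file, so hashes computed with one configuration are never compared with hashes from another.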
Some images break hash functions and produce hashes full of `0` or `255`, so these images are silently excluded from the end results (but are still saved to cache).
Some images break hash functions and produce hashes full of `0` or `255`, so these images are silently excluded from the end results (but are still saved to cache).
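
The exclusion rule can be sketched like this, assuming a hash is a sequence of bytes. The names `is_degenerate` and `drop_degenerate` are hypothetical and only illustrate the check described above.

```python
from typing import List, Tuple


def is_degenerate(hash_bytes: bytes) -> bool:
    """Return True when every byte of the hash is 0 or every byte is 255."""
    if not hash_bytes:
        return False
    return all(b == 0 for b in hash_bytes) or all(b == 255 for b in hash_bytes)


def drop_degenerate(entries: List[Tuple[str, bytes]]) -> List[Tuple[str, bytes]]:
    """Keep only entries with usable hashes for the final results.

    The cache itself would still contain every entry; only the list used
    for comparison is filtered here.
    """
    return [(name, h) for name, h in entries if not is_degenerate(h)]
```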
You can test each algorithm with the provided CLI tool: put a `test.jpg` file in a folder and run this command inside it: `czkawka_cli tester -i`