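How to Automate Song Analysis with Python | A Next-Generation Analysis System Combining Music Theory and AI

Song analysis is an essential part of music production, but doing it by hand takes a great deal of time and effort. This article walks through building a Python system that automatically analyzes a song's structure, chord progression, rhythm patterns, and acoustic features, with practical code examples along the way.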

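Why automate song analysis with Python?
Required libraries and setup
- 1. Setting up the base environment
- 2. Recommended development setup
Building a basic song analysis system
- 1. Loading and preprocessing audio files
- 2. Automatic detection of song structure
Automatic chord progression detection
- 1. Chord detection using chroma vectors
Rhythm and beat analysis
- 1. Tempo and beat detection
Detailed analysis of acoustic features
- 1. Spectral analysis and timbre features
Building an integrated analysis system
- 1. A comprehensive song analysis class
Practical usage examples
Advanced application: integrating machine learning
- 1. Building a genre classifier
Performance optimization
- 1. Speeding up with parallel processing
Practical application examples
- 1. Automatic playlist generation
Summary: the future of music analysis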

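Why automate song analysis with Python?

In modern music production, it is important to analyze large numbers of songs efficiently and feed what you learn back into your own creative work. Python has a rich set of audio processing libraries and machine learning frameworks, which makes it well suited to automating song analysis.

🎵 Benefits of song analysis with Python

Fast processing: batch jobs over thousands of songs are feasible
Reproducibility: the same settings give the same results on every run
Extensibility: new analysis methods are easy to add
Integration: connects easily with other tools and DAWs
Visualization: results can be shown as graphs and charts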

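Required libraries and setup

1. Setting up the base environment

📦 Installing the required libraries


# Music analysis libraries
pip install librosa soundfile
pip install music21 mido pretty_midi

# Numerical computing and data handling
pip install numpy scipy pandas

# Visualization
pip install matplotlib seaborn plotly

# Machine learning
pip install scikit-learn tensorflow

# Signal processing (pydub needs ffmpeg for compressed formats; essentia may not install on every platform)
pip install pydub essentia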
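
Before moving on, it can help to confirm which of these packages actually import in the current environment. The following is a minimal sketch that uses only the standard library. The helper name `check_dependencies` is made up for this example, and the list uses import names, which differ from the pip names in places (for example, `scikit-learn` imports as `sklearn`).

```python
import importlib.util
from typing import Dict, Iterable


def check_dependencies(module_names: Iterable[str]) -> Dict[str, bool]:
    """Report, for each top-level module name, whether it can be found."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


if __name__ == "__main__":
    # Import names (not pip names) of the packages installed above
    required = [
        "librosa", "soundfile", "music21", "mido", "pretty_midi",
        "numpy", "scipy", "pandas",
        "matplotlib", "seaborn", "plotly",
        "sklearn", "tensorflow",
        "pydub", "essentia",
    ]
    for name, available in check_dependencies(required).items():
        print(f"{'OK     ' if available else 'MISSING'} {name}")
```

Anything reported as MISSING can be installed individually before running the later examples.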

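2. Recommended development setup

💻 Recommended development environment

Python: 3.8 or later (3.10 recommended)
IDE: Jupyter Lab or VSCode
Memory: at least 8 GB (16 GB or more recommended)
GPU: a CUDA-capable GPU if you use deep learning
OS: Windows/Mac/Linux (Ubuntu is the most stable)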

Building a basic song analysis system

1. Loading and preprocessing audio files

🎼 Base class for audio file handling


import librosa
import numpy as np
import soundfile as sf
from typing import Tuple, Dict, List, Optional
import warnings
warnings.filterwarnings('ignore')

class AudioProcessor:
    """
    Loads an audio file and applies basic preprocessing.
    """
    def __init__(self, target_sr: int = 22050):
        self.target_sr = target_sr
        self.audio_data = None
        self.sr = None
        self.duration = None
        
    def load_audio(self, filepath: str) -> Tuple[Optional[np.ndarray], Optional[int]]:
        """
        Load an audio file.
        
        Parameters:
            filepath: path to the audio file
            
        Returns:
            audio_data: audio samples (NumPy array), or None on failure
            sr: sampling rate, or None on failure
        """
        try:
            # Load with librosa (downmixed to mono and resampled to target_sr)
            audio_data, sr = librosa.load(filepath, sr=self.target_sr)
            
            # Peak-normalize so the largest absolute sample is 1
            audio_data = librosa.util.normalize(audio_data)
            
            self.audio_data = audio_data
            self.sr = sr
            self.duration = len(audio_data) / sr
            
            print("✅ Audio file loaded")
            print(f"   - Sampling rate: {sr} Hz")
            print(f"   - Duration: {self.duration:.2f} s")
            print(f"   - Samples: {len(audio_data):,}")
            
            return audio_data, sr
            
        except Exception as e:
            print(f"❌ Error: {e}")
            return None, None
    
    def apply_preprocessing(self, 
                          remove_silence: bool = True,
                          normalize: bool = True,
                          trim_threshold: float = 20) -> np.ndarray:
        """
        Preprocess the loaded audio.
        
        Parameters:
            remove_silence: whether to trim leading and trailing silence
            normalize: whether to peak-normalize the result
            trim_threshold: level in dB below the peak that counts as silence
            
        Returns:
            processed_audio: the preprocessed audio
        """
        if self.audio_data is None:
            raise ValueError("No audio data has been loaded")
            
        processed_audio = self.audio_data.copy()
        
        # Trim silence (librosa.effects.trim only removes it at the start and end)
        if remove_silence:
            processed_audio, _ = librosa.effects.trim(
                processed_audio, 
                top_db=trim_threshold
            )
            print(f"✅ Silence trimmed (threshold: {trim_threshold} dB)")
        
        # Normalize
        if normalize:
            processed_audio = librosa.util.normalize(processed_audio)
            print("✅ Normalization complete")
            
        return processed_audio
    
    def extract_segments(self, 
                        segment_duration: float = 30.0,
                        overlap: float = 0.5) -> List[np.ndarray]:
        """
        Split the audio into fixed-length segments.
        
        Parameters:
            segment_duration: segment length in seconds
            overlap: overlap ratio between consecutive segments (0 <= overlap < 1)
            
        Returns:
            segments: list of segments
        """
        if self.audio_data is None:
            raise ValueError("No audio data has been loaded")
        if not 0 <= overlap < 1:
            raise ValueError("overlap must be at least 0 and less than 1")
            
        segment_samples = int(segment_duration * self.sr)
        # A hop of 0 would make range() fail, so keep it at 1 sample or more
        hop_samples = max(1, int(segment_samples * (1 - overlap)))
        
        segments = []
        for start in range(0, len(self.audio_data) - segment_samples + 1, hop_samples):
            segment = self.audio_data[start:start + segment_samples]
            segments.append(segment)
            
        print(f"✅ Segmentation complete: {len(segments)} segments")
        return segments
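
To make the segment arithmetic in `extract_segments` concrete, here is a small dependency-free sketch that computes only the start and end sample indices. The function name `segment_bounds` and the tiny 100 Hz sampling rate are assumptions chosen to keep the numbers easy to check by hand.

```python
from typing import List, Tuple


def segment_bounds(n_samples: int, sr: int,
                   segment_duration: float = 30.0,
                   overlap: float = 0.5) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of fixed-length, overlapping segments."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be at least 0 and less than 1")
    segment_samples = int(segment_duration * sr)
    if segment_samples <= 0:
        raise ValueError("segment_duration * sr must be at least 1 sample")
    # Distance between consecutive segment starts
    hop_samples = max(1, int(segment_samples * (1 - overlap)))
    return [
        (start, start + segment_samples)
        for start in range(0, n_samples - segment_samples + 1, hop_samples)
    ]


# A 90-second signal at a deliberately tiny 100 Hz sampling rate
bounds = segment_bounds(n_samples=9000, sr=100)
print(len(bounds))   # 5
print(bounds[:2])    # [(0, 3000), (1500, 4500)]
```

With 30-second segments and 50% overlap, a new segment starts every 15 seconds, and any tail shorter than a full segment is dropped. A signal shorter than one segment therefore yields an empty list.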

2. Automatic detection of song structure

This system automatically detects the structure of a song (intro, verse, chorus, and so on).
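
The class in this section relies on a self-similarity matrix (SSM): every short time frame is compared with every other frame, so repeated sections show up as bright blocks. As a rough illustration, the sketch below builds an SSM from hand-made three-dimensional "feature vectors" using cosine similarity. The toy data and the helper names are assumptions for this example, and plain Python lists stand in for the NumPy arrays used in the real code.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either one is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def self_similarity_matrix(frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Compare every frame with every other frame."""
    return [[cosine_similarity(f, g) for g in frames] for f in frames]


# Toy "song": section A (2 frames), section B (2 frames), then section A again
section_a = [1.0, 0.0, 0.2]
section_b = [0.0, 1.0, 0.1]
frames = [section_a, section_a, section_b, section_b, section_a, section_a]

for row in self_similarity_matrix(frames):
    print(" ".join(f"{value:.2f}" for value in row))
```

In the printed matrix, frames from the same section score close to 1 and frames from different sections score close to 0, which is the block pattern that boundary detection looks for.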

🏗️ Song structure analysis class


import librosa.display
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from scipy.spatial.distance import pdist, squareform

class StructureAnalyzer:
    """
    Analyzes the structure of a song.
    """
    def __init__(self, audio_processor: AudioProcessor):
        self.processor = audio_processor
        self.segments = None
        self.features = None
        self.structure = None
        
    def extract_features(self, feature_type: str = 'mfcc') -> np.ndarray:
        """
        Extract acoustic features.
        
        Parameters:
            feature_type: 'mfcc', 'chroma', or 'spectral';
                any other value selects a combined feature set
            
        Returns:
            features: feature matrix (features x frames)
        """
        if self.processor.audio_data is None:
            raise ValueError("No audio data has been loaded")
            
        audio = self.processor.audio_data
        sr = self.processor.sr
        
        if feature_type == 'mfcc':
            # MFCC (mel-frequency cepstral coefficients)
            features = librosa.feature.mfcc(
                y=audio, sr=sr, n_mfcc=13, hop_length=512
            )
            
        elif feature_type == 'chroma':
            # Chroma vectors (pitch-class distribution)
            features = librosa.feature.chroma_cqt(
                y=audio, sr=sr, hop_length=512
            )
            
        elif feature_type == 'spectral':
            # Combination of spectral features
            spectral_centroids = librosa.feature.spectral_centroid(
                y=audio, sr=sr, hop_length=512
            )
            spectral_rolloff = librosa.feature.spectral_rolloff(
                y=audio, sr=sr, hop_length=512
            )
            spectral_contrast = librosa.feature.spectral_contrast(
                y=audio, sr=sr, hop_length=512
            )
            features = np.vstack([
                spectral_centroids,
                spectral_rolloff,
                spectral_contrast
            ])
            
        else:
            # Combined features (librosa's default hop_length is also 512)
            mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
            chroma = librosa.feature.chroma_cqt(y=audio, sr=sr)
            tempogram = librosa.feature.tempogram(y=audio, sr=sr)
            features = np.vstack([mfcc, chroma, tempogram])
            
        self.features = features
        print(f"✅ Feature extraction complete: {features.shape}")
        return features
    
    def compute_self_similarity_matrix(self) -> np.ndarray:
        """
        Compute the self-similarity matrix.
        
        Returns:
            ssm: self-similarity matrix
        """
        if self.features is None:
            raise ValueError("Features have not been extracted")
            
        # Reduce the time resolution by keeping roughly one frame per second
        hop_length = 512
        sr = self.processor.sr
        frame_rate = sr / hop_length
        
        downsampling_rate = int(frame_rate)
        features_downsampled = self.features[:, ::downsampling_rate]
        
        # Cosine similarity between every pair of frames
        features_norm = features_downsampled.T
        norms = np.linalg.norm(features_norm, axis=1, keepdims=True)
        features_norm = features_norm / np.maximum(norms, 1e-10)  # avoid dividing by zero
        ssm = np.dot(features_norm, features_norm.T)
        
        print(f"✅ Self-similarity matrix computed: {ssm.shape}")
        return ssm
    
    def detect_segments(self, ssm: np.ndarray, n_segments: int = 8) -> Dict:
        """
        Detect song segments.
        
        Parameters:
            ssm: self-similarity matrix
            n_segments: number of clusters used to group similar segments
            
        Returns:
            structure: dict with the segments, boundaries, and novelty curve
        """
        # Compute the novelty curve
        kernel_size = 64
        half = kernel_size // 2
        
        # Checkerboard kernel: +1 in the diagonal quadrants, -1 in the
        # off-diagonal ones, so it responds where one block ends and another starts
        kernel = np.ones((kernel_size, kernel_size))
        kernel[:half, half:] = -1
        kernel[half:, :half] = -1
        
        # Slide the kernel along the main diagonal of the SSM
        novelty = np.zeros(len(ssm))
        for i in range(half, len(ssm) - half):
            start = i - half
            end = i + half
            novelty[i] = np.sum(ssm[start:end, start:end] * kernel)
        
        # Peaks in the novelty curve mark segment boundaries
        from scipy.signal import find_peaks
        peaks, properties = find_peaks(
            novelty, 
            prominence=np.std(novelty),
            distance=20
        )
        
        # Build the segment records (positions are SSM indices, about 1 s each)
        segments = []
        boundaries = [0] + [int(p) for p in peaks] + [len(ssm)]
        
        for i in range(len(boundaries) - 1):
            start_time = boundaries[i]
            end_time = boundaries[i + 1]
            duration = end_time - start_time
            
            segments.append({
                'start': start_time,
                'end': end_time,
                'duration': duration,
                'label': f'Section_{i+1}'
            })
        
        # Cluster the segments so that similar ones share a label
        if len(segments) > n_segments:
            # Convert SSM indices back to feature-frame indices
            step = int(self.processor.sr / 512)
            segment_features = []
            for seg in segments:
                start_idx = seg['start'] * step
                end_idx = seg['end'] * step
                seg_feat = self.features[:, start_idx:end_idx].mean(axis=1)
                segment_features.append(seg_feat)
            
            kmeans = KMeans(n_clusters=n_segments, random_state=42)
            labels = kmeans.fit_predict(np.array(segment_features))
            
            # Cluster IDs are arbitrary: these names only group similar
            # segments and do not identify the actual section type
            section_names = ['Intro', 'Verse', 'Pre-Chorus', 'Chorus', 
                           'Bridge', 'Outro', 'Instrumental', 'Break']
            for i, seg in enumerate(segments):
                if labels[i] < len(section_names):
                    seg['label'] = section_names[labels[i]]
        
        self.structure = {
            'segments': segments,
            'boundaries': boundaries,
            'novelty': novelty
        }
        
        print(f"✅ Segment detection complete: {len(segments)} segments")
        return self.structure
    
    def visualize_structure(self, save_path: Optional[str] = None):
        """
        Visualize the song structure.
        """
        if self.structure is None:
            raise ValueError("The song structure has not been analyzed")
            
        fig, axes = plt.subplots(3, 1, figsize=(12, 10))
        
        # 1. Waveform and segments
        ax1 = axes[0]
        time = np.linspace(0, self.processor.duration, len(self.processor.audio_data))
        ax1.plot(time, self.processor.audio_data, alpha=0.6)
        
        # Shade each detected segment
        colors = plt.cm.Set3(np.linspace(0, 1, len(self.structure['segments'])))
        for i, seg in enumerate(self.structure['segments']):
            start_time = seg['start']
            end_time = seg['end']
            ax1.axvspan(start_time, end_time, alpha=0.3, 
                       color=colors[i], label=seg['label'])
        
        ax1.set_xlabel('Time (s)')
        ax1.set_ylabel('Amplitude')
        ax1.set_title('Waveform with Detected Segments')
        ax1.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
        
        # 2. Self-similarity matrix
        ax2 = axes[1]
        ssm = self.compute_self_similarity_matrix()
        im = ax2.imshow(ssm, cmap='hot', aspect='auto', origin='lower')
        
        # Draw the segment boundaries
        for boundary in self.structure['boundaries'][1:-1]:
            ax2.axvline(x=boundary, color='cyan', linestyle='--', alpha=0.8)
            ax2.axhline(y=boundary, color='cyan', linestyle='--', alpha=0.8)
        
        ax2.set_xlabel('Time (s)')
        ax2.set_ylabel('Time (s)')
        ax2.set_title('Self-Similarity Matrix')
        
        # 3. Novelty curve
        ax3 = axes[2]
        ax3.plot(self.structure['novelty'])
        
        # Mark the detected peaks
        for boundary in self.structure['boundaries'][1:-1]:
            ax3.axvline(x=boundary, color='red', linestyle='--', alpha=0.8)
        
        ax3.set_xlabel('Time (s)')
        ax3.set_ylabel('Novelty')
        ax3.set_title('Novelty Curve with Segment Boundaries')
        
        plt.tight_layout()
        
        if save_path:
            plt.savefig(save_path, dpi=300, bbox_inches='tight')
            print(f"✅ Structure plot saved: {save_path}")
        
        plt.show()
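
The boundary detection in `detect_segments` is based on sliding a checkerboard kernel along the diagonal of the self-similarity matrix. The sketch below shows the idea on a hand-made 8x8 matrix with a single boundary. It is a simplified illustration in plain Python with a kernel size of 4 (an assumption for readability), and it does not use the class above.

```python
from typing import List


def checkerboard_kernel(size: int) -> List[List[int]]:
    """+1 in the two diagonal quadrants, -1 in the two off-diagonal ones."""
    half = size // 2
    return [
        [1 if (r < half) == (c < half) else -1 for c in range(size)]
        for r in range(size)
    ]


def novelty_curve(ssm: List[List[float]], kernel_size: int = 4) -> List[float]:
    """Slide the kernel along the main diagonal of the SSM."""
    half = kernel_size // 2
    kernel = checkerboard_kernel(kernel_size)
    novelty = [0.0] * len(ssm)
    for i in range(half, len(ssm) - half):
        start = i - half
        novelty[i] = sum(
            ssm[start + r][start + c] * kernel[r][c]
            for r in range(kernel_size)
            for c in range(kernel_size)
        )
    return novelty


# Toy SSM: frames 0-3 form one section, frames 4-7 another
ssm = [[1.0 if (i < 4) == (j < 4) else 0.0 for j in range(8)] for i in range(8)]
curve = novelty_curve(ssm)
print(curve)                     # [0.0, 0.0, 0.0, 2.0, 8.0, 2.0, 0.0, 0.0]
print(curve.index(max(curve)))   # 4
```

Inside a homogeneous section the positive and negative quadrants cancel out, so the curve stays at 0. It peaks at index 4, exactly where the second section begins.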

Automatic chord progression detection

1. Chord detection using chroma vectors

🎹 Chord progression detection class


from scipy.stats import mode
import music21 as m21

class ChordProgressionAnalyzer:
    """
    Analyzes chord progressions.
    """
    def __init__(self, audio_processor: AudioProcessor):
        self.processor = audio_processor
        self.chroma = None
        self.chord_sequence = None
        self.chord_templates = self._create_chord_templates()
        
    def _create_chord_templates(self) -> Dict[str, np.ndarray]:
        """
        Build chroma templates for major, minor, and dominant 7th chords.
        """
        note_names = ['C', 'C#', 'D', 'D#', 'E', 'F', 
                      'F#', 'G', 'G#', 'A', 'A#', 'B']
        # Suffix -> list of (semitones above the root, weight)
        chord_types = {
            '': [(0, 1.0), (4, 0.8), (7, 0.8)],              # major: root, major 3rd, perfect 5th
            'm': [(0, 1.0), (3, 0.8), (7, 0.8)],             # minor: root, minor 3rd, perfect 5th
            '7': [(0, 1.0), (4, 0.8), (7, 0.8), (10, 0.6)],  # 7th: major triad plus minor 7th
        }
        
        templates = {}
        for suffix, intervals in chord_types.items():
            for i, root in enumerate(note_names):
                template = np.zeros(12)
                for offset, weight in intervals:
                    template[(i + offset) % 12] = weight
                templates[root + suffix] = template
            
        return templates
    
    def extract_chromagram(self, hop_length: int = 512) -> np.ndarray:
        """
        Extract the chromagram.
        
        Note: detect_chords assumes hop_length=512 when converting frames to time.
        """
        if self.processor.audio_data is None:
            raise ValueError("No audio data has been loaded")
            
        # Compute the chromagram
        self.chroma = librosa.feature.chroma_cqt(
            y=self.processor.audio_data,
            sr=self.processor.sr,
            hop_length=hop_length,
            n_chroma=12
        )
        
        print(f"✅ Chromagram extracted: {self.chroma.shape}")
        return self.chroma
    
    def detect_chords(self, 
                     segment_length: float = 0.5,
                     smoothing: bool = True) -> List[Dict]:
        """
        Detect chords.
        
        Parameters:
            segment_length: time unit for chord detection (seconds)
            smoothing: whether to apply smoothing
            
        Returns:
            chord_sequence: list of detected chords
        """
        if self.chroma is None:
            self.extract_chromagram()
            
        hop_length = 512  # must match the value used in extract_chromagram
        sr = self.processor.sr
        frame_rate = sr / hop_length
        
        # Detect one chord per segment
        segment_frames = max(1, int(segment_length * frame_rate))
        n_segments = self.chroma.shape[1] // segment_frames
        
        chord_sequence = []
        
        for i in range(n_segments):
            start_frame = i * segment_frames
            end_frame = min((i + 1) * segment_frames, self.chroma.shape[1])
            
            # Average the chroma vectors within the segment
            segment_chroma = np.mean(
                self.chroma[:, start_frame:end_frame], axis=1
            )
            
            # Normalize
            if np.sum(segment_chroma) > 0:
                segment_chroma = segment_chroma / np.sum(segment_chroma)
            
            # Template matching ('N' = no chord, used for silent segments)
            best_chord = 'N'
            best_score = 0.0
            chroma_norm = np.linalg.norm(segment_chroma)
            
            if chroma_norm > 0:
                best_score = -1.0
                for chord_name, template in self.chord_templates.items():
                    # Cosine similarity
                    score = np.dot(segment_chroma, template) / (
                        chroma_norm * np.linalg.norm(template)
                    )
                    
                    if score > best_score:
                        best_score = score
                        best_chord = chord_name
            
            chord_info = {
                'chord': best_chord,
                'confidence': best_score,
                # Derive times from frame indices so they do not drift
                'start_time': start_frame / frame_rate,
                'end_time': end_frame / frame_rate,
                'chroma': segment_chroma
            }
            
            chord_sequence.append(chord_info)
        
        # Smoothing (removes short-lived changes)
        if smoothing:
            chord_sequence = self._smooth_chord_sequence(chord_sequence)
        
        self.chord_sequence = chord_sequence
        print(f"✅ Chord detection complete: {len(chord_sequence)} chords")
        
        return chord_sequence
    
    def _smooth_chord_sequence(self, 
                              chord_sequence: List[Dict],
                              min_duration: float = 1.0) -> List[Dict]:
        """
        Smooth the chord sequence.
        
        A chord change is accepted only after the current chord has lasted
        at least min_duration seconds; earlier changes are ignored.
        """
        smoothed = []
        current_chord = None
        current_start = 0
        
        for i, chord_info in enumerate(chord_sequence):
            if current_chord is None:
                current_chord = chord_info['chord']
                current_start = chord_info['start_time']
            
            elif chord_info['chord'] != current_chord:
                # The chord changed
                duration = chord_info['start_time'] - current_start
                
                if duration >= min_duration:
                    # Long enough: commit the current chord
                    smoothed.append({
                        'chord': current_chord,
                        'start_time': current_start,
                        'end_time': chord_info['start_time'],
                        'duration': duration
                    })
                    current_chord = chord_info['chord']
                    current_start = chord_info['start_time']
                else:
                    # Too short: ignore the change
                    pass
        
        # Final chord
        if current_chord:
            smoothed.append({
                'chord': current_chord,
                'start_time': current_start,
                'end_time': chord_sequence[-1]['end_time'],
                'duration': chord_sequence[-1]['end_time'] - current_start
            })
        
        return smoothed
    
    def analyze_chord_progression_patterns(self) -> Dict:
        """
        Analyze chord progression patterns.
        """
        if self.chord_sequence is None:
            raise ValueError("No chords have been detected")
            
        # Chord progression statistics
        chord_counts = {}
        transitions = {}
        
        for i, chord_info in enumerate(self.chord_sequence):
            chord = chord_info['chord']
            
            # Number of occurrences of each chord
            chord_counts[chord] = chord_counts.get(chord, 0) + 1
            
            # Chord transitions
            if i < len(self.chord_sequence) - 1:
                next_chord = self.chord_sequence[i + 1]['chord']
                transition = f"{chord} -> {next_chord}"
                transitions[transition] = transitions.get(transition, 0) + 1
        
        # Common progression patterns, written in the key of C
        common_progressions = {
            'ii-V-I': ['Dm', 'G7', 'C'],
            'I-V-vi-IV': ['C', 'G', 'Am', 'F'],
            'I-vi-IV-V': ['C', 'Am', 'F', 'G'],
            'vi-IV-I-V': ['Am', 'F', 'C', 'G'],
            'I-IV-V': ['C', 'F', 'G']
        }
        
        detected_patterns = []
        
        # Pattern detection (simplified)
        chord_names = [c['chord'] for c in self.chord_sequence]
        for pattern_name, pattern in common_progressions.items():
            # Length of the pattern
            pattern_len = len(pattern)
            
            # Search with a sliding window
            for i in range(len(chord_names) - pattern_len + 1):
                window = chord_names[i:i + pattern_len]
                
                # Compare the window with the pattern (see _is_pattern_match)
                if self._is_pattern_match(window, pattern):
                    detected_patterns.append({
                        'pattern': pattern_name,
                        'position': i,
                        'chords': window
                    })
        
        analysis = {
            'chord_counts': chord_counts,
            'transitions': transitions,
            'detected_patterns': detected_patterns,
            'most_common_chord': max(chord_counts, key=chord_counts.get) if chord_counts else None,
            'most_common_transition': max(transitions, key=transitions.get) if transitions else None
        }
        
        return analysis
    
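    def _is_pattern_match(self, 
                         actual_chords: List[str], 
                         pattern: List[str]) -> bool:
        """
        Check whether a chord window matches a pattern in any key.
        
        Two sequences match when the chord qualities are identical and
        every root is shifted by the same number of semitones.
        """
        if len(actual_chords) != len(pattern) or not pattern:
            return False
            
        note_names = ['C', 'C#', 'D', 'D#', 'E', 'F', 
                      'F#', 'G', 'G#', 'A', 'A#', 'B']
        
        def split(chord):
            # The root is one letter plus an optional '#'; the rest is the quality
            root_len = 2 if chord[1:2] == '#' else 1
            return note_names.index(chord[:root_len]), chord[root_len:]
        
        try:
            actual = [split(c) for c in actual_chords]
            expected = [split(c) for c in pattern]
        except (ValueError, TypeError):
            return False  # unknown chord label (for example None or 'N')
        
        # Transposition offset implied by the first chord
        offset = (actual[0][0] - expected[0][0]) % 12
        return all(
            quality_a == quality_e and (root_a - root_e) % 12 == offset
            for (root_a, quality_a), (root_e, quality_e) in zip(actual, expected)
        )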
    
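    def export_to_midi(self, output_path: str):
        """
        Export the detected chord progression as a MIDI file.
        """
        if self.chord_sequence is None:
            raise ValueError("No chords have been detected")
            
        # Build the MIDI data with music21
        stream = m21.stream.Stream()
        
        for chord_info in self.chord_sequence:
            chord_name = chord_info['chord']
            duration = chord_info.get('duration', 0.5)
            
            # Create a music21 chord object from the chord name
            try:
                chord_obj = m21.harmony.ChordSymbol(chord_name)
                # Fixed factor: 1 second = 4 quarter notes (i.e. 240 BPM);
                # scale by the actual tempo for accurate timing
                chord_obj.duration = m21.duration.Duration(duration * 4)
                stream.append(chord_obj)
            except Exception as e:
                # Skip labels music21 cannot parse instead of failing silently
                print(f"⚠️ Skipped chord {chord_name!r}: {e}")
        
        # Save as a MIDI file
        stream.write('midi', fp=output_path)
        print(f"✅ MIDI file saved: {output_path}")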
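
The core of `detect_chords` is template matching: each averaged chroma vector is compared with every chord template, and the closest template wins. The following stand-alone sketch reproduces that step for major and minor triads only, using plain Python lists. The idealized chroma vector and the helper names `build_templates`, `cosine`, and `best_chord` are assumptions for illustration; real chroma from audio is much noisier.

```python
import math
from typing import Dict, List, Sequence

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def build_templates() -> Dict[str, List[float]]:
    """Major and minor triad templates: root 1.0, third and fifth 0.8."""
    templates = {}
    for suffix, third in (("", 4), ("m", 3)):
        for i, root in enumerate(NOTE_NAMES):
            template = [0.0] * 12
            template[i] = 1.0
            template[(i + third) % 12] = 0.8
            template[(i + 7) % 12] = 0.8
            templates[root + suffix] = template
    return templates


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; 0.0 if either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def best_chord(chroma: Sequence[float], templates: Dict[str, List[float]]) -> str:
    """Name of the template most similar to the chroma vector."""
    return max(templates, key=lambda name: cosine(chroma, templates[name]))


templates = build_templates()

# Idealized chroma for a C major triad: equal energy on C, E, and G
c_major = [0.0] * 12
for pitch_class in (0, 4, 7):
    c_major[pitch_class] = 1.0

print(best_chord(c_major, templates))   # C
```

Because the score is a cosine similarity, the overall loudness of the segment does not matter; only the relative balance between pitch classes does.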

Rhythm and Beat Analysis

1. Tempo and Beat Detection

🥁 Rhythm Analysis Class


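import matplotlib.pyplot as plt
import librosa.display  # plotting helpers used by visualize_rhythm


class RhythmAnalyzer:
    """
    Analyzes rhythm and beats.
    """
    def __init__(self, audio_processor: AudioProcessor):
        self.processor = audio_processor
        self.tempo = None
        self.beats = None
        self.onset_envelope = None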
        
    def estimate_tempo(self, method: str = 'ellis') -> float:
        """
        Estimate the tempo.

        Parameters:
            method: 'ellis' uses the beat tracker's tempo estimate; any other
                value falls back to librosa's standalone tempo estimator

        Returns:
            tempo: estimated tempo in BPM
        """
        if self.processor.audio_data is None:
            raise ValueError("Audio data has not been loaded")

        # Compute the onset strength envelope
        self.onset_envelope = librosa.onset.onset_strength(
            y=self.processor.audio_data,
            sr=self.processor.sr
        )

        # Estimate the tempo
        if method == 'ellis':
            tempo, _ = librosa.beat.beat_track(
                onset_envelope=self.onset_envelope,
                sr=self.processor.sr
            )
        else:
            # Standalone estimator (newer librosa releases expose it as librosa.feature.tempo)
            tempo = librosa.beat.tempo(
                onset_envelope=self.onset_envelope,
                sr=self.processor.sr
            )

        # librosa may return the tempo as a scalar or as a one-element array
        tempo = float(np.atleast_1d(tempo)[0])
        self.tempo = tempo
        print(f"✅ Tempo estimated: {tempo:.1f} BPM")

        return tempo
    
    def detect_beats(self, trim: bool = True) -> np.ndarray:
        """
        Detect beat positions.

        Parameters:
            trim: whether to drop leading/trailing beats with weak onset strength

        Returns:
            beat_times: beat positions in seconds
        """
        if self.onset_envelope is None:
            self.estimate_tempo()

        # Beat tracking
        tempo, beats = librosa.beat.beat_track(
            onset_envelope=self.onset_envelope,
            sr=self.processor.sr,
            trim=trim
        )

        # Convert frame indices to seconds
        beat_times = librosa.frames_to_time(
            beats,
            sr=self.processor.sr,
            hop_length=512
        )

        self.beats = beat_times
        print(f"✅ Beats detected: {len(beat_times)} beats")

        return beat_times
    
    def analyze_rhythm_patterns(self,
                              bar_length: int = 4,
                              pattern_length: int = 16) -> Dict:
        """
        Analyze rhythm patterns.

        Parameters:
            bar_length: number of beats per bar
            pattern_length: pattern length in beats

        Returns:
            rhythm_analysis: rhythm analysis results
        """
        if self.beats is None:
            self.detect_beats()

        # Inter-beat intervals
        beat_intervals = np.diff(self.beats)
        mean_interval = np.mean(beat_intervals)
        std_interval = np.std(beat_intervals)

        # Rhythm stability (1.0 means perfectly even beats)
        rhythm_stability = 1.0 - (std_interval / mean_interval)

        # Group beats into bars, treating the first detected beat as a downbeat
        bars = []
        for i in range(0, len(self.beats) - bar_length + 1, bar_length):
            bar_beats = self.beats[i:i + bar_length]
            bars.append({
                'start_time': bar_beats[0],
                'end_time': bar_beats[-1],
                'beats': bar_beats
            })

        # Pattern extraction (simplified)
        patterns = self._extract_rhythm_patterns(
            beat_intervals, pattern_length
        )

        # Groove analysis
        groove_analysis = self._analyze_groove(beat_intervals)

        analysis = {
            'tempo': self.tempo,
            'num_beats': len(self.beats),
            'mean_beat_interval': mean_interval,
            'rhythm_stability': rhythm_stability,
            'num_bars': len(bars),
            'patterns': patterns,
            'groove': groove_analysis
        }

        return analysis
    
    def _extract_rhythm_patterns(self,
                               beat_intervals: np.ndarray,
                               pattern_length: int) -> List[Dict]:
        """
        Extract rhythm patterns.
        """
        patterns = []

        # Quantization grid: duration of a sixteenth note at the estimated tempo
        sixteenth_duration = 60.0 / self.tempo / 4

        # Slide a window over the intervals (simplified implementation)
        for i in range(0, len(beat_intervals) - pattern_length + 1):
            pattern_intervals = beat_intervals[i:i + pattern_length]

            # Express each interval as a whole number of sixteenth notes
            normalized_pattern = pattern_intervals / sixteenth_duration
            quantized_pattern = np.round(normalized_pattern).astype(int)

            patterns.append({
                'position': i,
                'pattern': quantized_pattern.tolist(),
                'confidence': 1.0  # TODO: compute a real confidence value
            })

        return patterns
    
    def _analyze_groove(self, beat_intervals: np.ndarray) -> Dict:
        """
        Analyze the groove (the rhythmic "feel").
        """
        # Swing ratio: mean of the odd-indexed intervals divided by
        # the mean of the even-indexed intervals
        even_intervals = beat_intervals[::2]
        odd_intervals = beat_intervals[1::2]

        if len(even_intervals) > 0 and len(odd_intervals) > 0:
            swing_ratio = np.mean(odd_intervals) / np.mean(even_intervals)
        else:
            swing_ratio = 1.0

        # Tightness (how evenly spaced the beats are)
        tightness = 1.0 - np.std(beat_intervals) / np.mean(beat_intervals)

        return {
            'swing_ratio': swing_ratio,
            'tightness': tightness,
            'groove_type': self._classify_groove(swing_ratio)
        }
    
    def _classify_groove(self, swing_ratio: float) -> str:
        """
        Classify the groove type from the swing ratio.
        """
        if swing_ratio < 0.95:
            return 'rushed'
        elif swing_ratio < 1.05:
            return 'straight'
        elif swing_ratio < 1.3:
            return 'light_swing'
        elif swing_ratio < 1.5:
            return 'medium_swing'
        else:
            return 'heavy_swing'
    
    def visualize_rhythm(self, save_path: Optional[str] = None):
        """
        Visualize the rhythm analysis.
        """
        if self.beats is None or self.onset_envelope is None:
            raise ValueError("Rhythm analysis has not been run")

        fig, axes = plt.subplots(3, 1, figsize=(12, 8))

        # 1. Onset strength and beat positions
        ax1 = axes[0]
        time_frames = librosa.frames_to_time(
            np.arange(len(self.onset_envelope)),
            sr=self.processor.sr,
            hop_length=512
        )
        ax1.plot(time_frames, self.onset_envelope, label='Onset Strength')
        ax1.vlines(self.beats, 0, self.onset_envelope.max(),
                  color='r', alpha=0.6, label='Beats')
        ax1.set_xlabel('Time (s)')
        ax1.set_ylabel('Onset Strength')
        ax1.set_title(f'Beat Tracking (Tempo: {self.tempo:.1f} BPM)')
        ax1.legend()

        # 2. Tempogram
        ax2 = axes[1]
        tempogram = librosa.feature.tempogram(
            onset_envelope=self.onset_envelope,
            sr=self.processor.sr
        )
        librosa.display.specshow(
            tempogram,
            sr=self.processor.sr,
            x_axis='time',
            y_axis='tempo',
            ax=ax2
        )
        ax2.set_title('Tempogram')
        ax2.axhline(self.tempo, color='w', linestyle='--', alpha=0.8,
                   label=f'{self.tempo:.1f} BPM')
        ax2.legend()

        # 3. Histogram of inter-beat intervals
        ax3 = axes[2]
        beat_intervals = np.diff(self.beats)
        ax3.hist(beat_intervals, bins=30, alpha=0.7, edgecolor='black')
        ax3.axvline(np.mean(beat_intervals), color='r', linestyle='--',
                   label=f'Mean: {np.mean(beat_intervals):.3f}s')
        ax3.set_xlabel('Beat Interval (s)')
        ax3.set_ylabel('Count')
        ax3.set_title('Beat Interval Distribution')
        ax3.legend()

        plt.tight_layout()

        if save_path:
            plt.savefig(save_path, dpi=300, bbox_inches='tight')
            print(f"✅ Rhythm analysis figure saved: {save_path}")

        plt.show()
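
To see what the stability and swing numbers mean without loading any audio, the same arithmetic can be applied to a hand-written list of beat times. The sketch below mirrors the formulas used in `analyze_rhythm_patterns` and `_analyze_groove` using only the standard library. `beat_interval_stats` is a hypothetical name chosen for this illustration and is not part of the class above.

```python
from statistics import mean, pstdev
from typing import Dict, List


def beat_interval_stats(beat_times: List[float]) -> Dict[str, float]:
    """Summarize inter-beat intervals for a list of beat times in seconds."""
    if len(beat_times) < 3:
        raise ValueError("need at least three beats to compare intervals")
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    mean_interval = mean(intervals)
    # Population standard deviation, the same convention as np.std's default.
    std_interval = pstdev(intervals)
    # Even-indexed intervals versus odd-indexed intervals.
    even, odd = intervals[::2], intervals[1::2]
    return {
        'bpm': 60.0 / mean_interval,
        'stability': 1.0 - std_interval / mean_interval,
        'swing_ratio': mean(odd) / mean(even),
    }


if __name__ == '__main__':
    # Perfectly even beats at half-second spacing.
    print(beat_interval_stats([0.0, 0.5, 1.0, 1.5, 2.0]))
    # Alternating short/long intervals, as in a swung feel.
    print(beat_interval_stats([0.0, 0.4, 1.0, 1.4, 2.0]))
```

Evenly spaced beats give a stability of 1.0 and a swing ratio of 1.0. Alternating intervals lower the stability and push the swing ratio away from 1.0, and that ratio is what `_classify_groove` turns into a label.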

Detailed Acoustic Feature Analysis

1. Spectral Analysis and Timbre Features

🎵 Acoustic Feature Analysis Class


import pandas as pd  # used by create_feature_summary


class AcousticAnalyzer:
    """
    Analyzes acoustic features in detail.
    """
    def __init__(self, audio_processor: AudioProcessor):
        self.processor = audio_processor
        self.features = {}
        
    def extract_all_features(self) -> Dict:
        """
        Extract a comprehensive set of acoustic features.
        """
        if self.processor.audio_data is None:
            raise ValueError("Audio data has not been loaded")

        audio = self.processor.audio_data
        sr = self.processor.sr

        print("Extracting acoustic features...")

        # Time-domain features
        self.features['temporal'] = self._extract_temporal_features(audio)

        # Frequency-domain features
        self.features['spectral'] = self._extract_spectral_features(audio, sr)

        # Timbre features
        self.features['timbre'] = self._extract_timbre_features(audio, sr)

        # Dynamics features
        self.features['dynamics'] = self._extract_dynamics_features(audio, sr)

        # Tonal features
        self.features['tonal'] = self._extract_tonal_features(audio, sr)

        print("✅ Acoustic feature extraction complete")
        return self.features
    
    def _extract_temporal_features(self, audio: np.ndarray) -> Dict:
        """
        Extract time-domain features.
        """
        features = {}

        # RMS (root mean square) energy
        rms = librosa.feature.rms(y=audio)[0]
        features['rms_mean'] = np.mean(rms)
        features['rms_std'] = np.std(rms)
        features['rms_max'] = np.max(rms)

        # Zero-crossing rate
        zcr = librosa.feature.zero_crossing_rate(audio)[0]
        features['zcr_mean'] = np.mean(zcr)
        features['zcr_std'] = np.std(zcr)

        # Autocorrelation
        autocorr = librosa.autocorrelate(audio)
        features['autocorr_max'] = np.max(autocorr[1:])  # exclude lag 0

        return features
    
    def _extract_spectral_features(self,
                                 audio: np.ndarray,
                                 sr: int) -> Dict:
        """
        Extract frequency-domain features.
        """
        features = {}

        # Spectral centroid
        spectral_centroid = librosa.feature.spectral_centroid(
            y=audio, sr=sr
        )[0]
        features['spectral_centroid_mean'] = np.mean(spectral_centroid)
        features['spectral_centroid_std'] = np.std(spectral_centroid)

        # Spectral bandwidth
        spectral_bandwidth = librosa.feature.spectral_bandwidth(
            y=audio, sr=sr
        )[0]
        features['spectral_bandwidth_mean'] = np.mean(spectral_bandwidth)
        features['spectral_bandwidth_std'] = np.std(spectral_bandwidth)

        # Spectral rolloff
        spectral_rolloff = librosa.feature.spectral_rolloff(
            y=audio, sr=sr
        )[0]
        features['spectral_rolloff_mean'] = np.mean(spectral_rolloff)
        features['spectral_rolloff_std'] = np.std(spectral_rolloff)

        # Spectral contrast (one mean per frequency band)
        spectral_contrast = librosa.feature.spectral_contrast(
            y=audio, sr=sr
        )
        for i in range(spectral_contrast.shape[0]):
            features[f'spectral_contrast_{i}_mean'] = np.mean(
                spectral_contrast[i]
            )

        # Spectral flatness
        spectral_flatness = librosa.feature.spectral_flatness(y=audio)[0]
        features['spectral_flatness_mean'] = np.mean(spectral_flatness)
        features['spectral_flatness_std'] = np.std(spectral_flatness)

        return features
    
    def _extract_timbre_features(self,
                               audio: np.ndarray,
                               sr: int) -> Dict:
        """
        Extract timbre features.
        """
        features = {}

        # MFCC (mel-frequency cepstral coefficients)
        mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
        for i in range(mfcc.shape[0]):
            features[f'mfcc_{i}_mean'] = np.mean(mfcc[i])
            features[f'mfcc_{i}_std'] = np.std(mfcc[i])

        # Delta MFCC (first-order difference over time)
        mfcc_delta = librosa.feature.delta(mfcc)
        for i in range(mfcc_delta.shape[0]):
            features[f'mfcc_delta_{i}_mean'] = np.mean(mfcc_delta[i])

        # Mel spectrogram statistics (in dB)
        mel_spec = librosa.feature.melspectrogram(y=audio, sr=sr)
        mel_spec_db = librosa.power_to_db(mel_spec)
        features['mel_spec_mean'] = np.mean(mel_spec_db)
        features['mel_spec_std'] = np.std(mel_spec_db)

        return features
    
    def _extract_dynamics_features(self,
                                 audio: np.ndarray,
                                 sr: int) -> Dict:
        """
        Extract dynamics features.
        """
        features = {}

        # Dynamic range in dB (loudest frame RMS over quietest frame RMS)
        rms = librosa.feature.rms(y=audio)[0]
        features['dynamic_range'] = 20 * np.log10(
            np.max(rms) / (np.min(rms) + 1e-6)
        )

        # Crest factor (peak-to-RMS ratio)
        peak = np.max(np.abs(audio))
        rms_value = np.sqrt(np.mean(audio**2))
        features['crest_factor'] = peak / (rms_value + 1e-6)

        # Loudness (rough estimate): a fuller measure would apply
        # A-weighting before the RMS; plain RMS is used here for simplicity
        features['loudness_estimate'] = np.mean(rms)

        # Envelope statistics (mean STFT magnitude per frame)
        envelope = np.abs(librosa.stft(audio)).mean(axis=0)
        features['envelope_attack_time'] = self._estimate_attack_time(
            envelope, sr
        )
        features['envelope_decay_time'] = self._estimate_decay_time(
            envelope, sr
        )

        return features
    
    def _extract_tonal_features(self,
                              audio: np.ndarray,
                              sr: int) -> Dict:
        """
        Extract tonal features.
        """
        features = {}

        # Chroma vectors, averaged over time
        chroma = librosa.feature.chroma_cqt(y=audio, sr=sr)
        chroma_mean = np.mean(chroma, axis=1)

        # Strength of each pitch class
        pitch_classes = ['C', 'C#', 'D', 'D#', 'E', 'F',
                        'F#', 'G', 'G#', 'A', 'A#', 'B']
        for i, pc in enumerate(pitch_classes):
            features[f'chroma_{pc}'] = chroma_mean[i]

        # Key estimation (simplified): correlation with Krumhansl-Schmuckler
        # key profiles. Both profiles have the tonic at index 0, so they
        # describe C major and C minor; only these two keys are compared.
        key_profiles = {
            'C_major': [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52,
                       5.19, 2.39, 3.66, 2.29, 2.88],
            'C_minor': [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54,
                       4.75, 3.98, 2.69, 3.34, 3.17]
        }

        correlations = {}
        for key_name, profile in key_profiles.items():
            correlation = np.corrcoef(chroma_mean, profile)[0, 1]
            correlations[key_name] = correlation

        # Most likely key among the candidates
        estimated_key = max(correlations, key=correlations.get)
        features['estimated_key'] = estimated_key
        features['key_confidence'] = correlations[estimated_key]

        # Tonal centroid features (tonnetz)
        tonnetz = librosa.feature.tonnetz(y=audio, sr=sr)
        features['tonnetz_mean'] = np.mean(tonnetz)
        features['tonnetz_std'] = np.std(tonnetz)

        return features
    
    def _estimate_attack_time(self,
                            envelope: np.ndarray,
                            sr: int) -> float:
        """
        Estimate the attack time.
        """
        # Time for the envelope to rise from 10% to 90% of its maximum
        max_val = np.max(envelope)
        start_idx = np.where(envelope > 0.1 * max_val)[0]
        peak_idx = np.where(envelope > 0.9 * max_val)[0]

        if len(start_idx) > 0 and len(peak_idx) > 0:
            attack_frames = peak_idx[0] - start_idx[0]
            attack_time = attack_frames * 512 / sr  # hop_length=512
            return attack_time
        else:
            return 0.0
    
    def _estimate_decay_time(self,
                           envelope: np.ndarray,
                           sr: int) -> float:
        """
        Estimate the decay time.
        """
        # Time from the peak until the envelope has dropped by 60 dB
        # (a simplified take on RT60)
        max_idx = np.argmax(envelope)
        max_val = envelope[max_idx]

        # -60 dB corresponds to an amplitude ratio of 1/1000
        threshold = max_val / 1000
        decay_idx = np.where(envelope[max_idx:] < threshold)[0]

        if len(decay_idx) > 0:
            decay_frames = decay_idx[0]
            decay_time = decay_frames * 512 / sr
            return decay_time
        else:
            return 0.0
    
    def create_feature_summary(self) -> pd.DataFrame:
        """
        Build a one-row DataFrame summarizing the extracted features.
        """
        if not self.features:
            raise ValueError("Features have not been extracted")

        # Flatten the nested dict into "category_feature" keys
        flat_features = {}
        for category, features in self.features.items():
            for feature_name, value in features.items():
                flat_features[f"{category}_{feature_name}"] = value

        # Convert to a DataFrame
        df = pd.DataFrame([flat_features])

        return df
    
    def visualize_features(self, save_path: Optional[str] = None):
        """
        Visualize the acoustic features.
        """
        if not self.features:
            raise ValueError("Features have not been extracted yet")

        fig, axes = plt.subplots(2, 2, figsize=(15, 10))

        # 1. Spectral centroid over time
        ax1 = axes[0, 0]
        audio = self.processor.audio_data
        sr = self.processor.sr

        spectral_centroid = librosa.feature.spectral_centroid(y=audio, sr=sr)[0]
        frames = range(len(spectral_centroid))
        t = librosa.frames_to_time(frames, sr=sr)

        ax1.plot(t, spectral_centroid, label='Spectral Centroid')
        ax1.set_xlabel('Time (s)')
        ax1.set_ylabel('Hz')
        ax1.set_title('Spectral Centroid over Time')
        ax1.legend()

        # 2. MFCC
        ax2 = axes[0, 1]
        mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
        img = librosa.display.specshow(
            mfcc, sr=sr, x_axis='time', ax=ax2
        )
        ax2.set_title('MFCC')
        ax2.set_ylabel('MFCC Coefficients')
        plt.colorbar(img, ax=ax2)

        # 3. Chromagram
        ax3 = axes[1, 0]
        chroma = librosa.feature.chroma_cqt(y=audio, sr=sr)
        img = librosa.display.specshow(
            chroma, sr=sr, x_axis='time', y_axis='chroma', ax=ax3
        )
        ax3.set_title('Chromagram')
        plt.colorbar(img, ax=ax3)

        # 4. Radar chart of the main features
        # Swap the Cartesian axes in the bottom-right slot for a polar one,
        # so the two do not end up drawn on top of each other
        axes[1, 1].remove()
        ax4 = fig.add_subplot(2, 2, 4, projection='polar')

        # Pick the main features
        selected_features = [
            'spectral_centroid_mean',
            'spectral_bandwidth_mean',
            'spectral_rolloff_mean',
            'rms_mean',
            'zcr_mean'
        ]

        values = []
        labels = []

        for feature in selected_features:
            for category in self.features.values():
                if feature in category:
                    values.append(category[feature])
                    labels.append(feature.replace('_mean', '').replace('_', ' ').title())
                    break

        # Min-max normalization (guarded against division by zero
        # when all selected values are identical)
        values = np.array(values, dtype=float)
        value_range = values.max() - values.min()
        if value_range > 0:
            values_norm = (values - values.min()) / value_range
        else:
            values_norm = np.zeros_like(values)

        # Repeat the first point so the radar polygon closes
        angles = np.linspace(0, 2 * np.pi, len(labels), endpoint=False)
        values_norm = np.concatenate((values_norm, [values_norm[0]]))
        angles = np.concatenate((angles, [angles[0]]))

        ax4.plot(angles, values_norm, 'o-', linewidth=2)
        ax4.fill(angles, values_norm, alpha=0.25)
        ax4.set_xticks(angles[:-1])
        ax4.set_xticklabels(labels, size=8)
        ax4.set_ylim(0, 1)
        ax4.set_title('Normalized Feature Values', pad=20)

        plt.tight_layout()

        if save_path:
            fig.savefig(save_path, dpi=300, bbox_inches='tight')
            print(f"✅ Saved acoustic feature plot: {save_path}")

        plt.show()
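
One caveat about the radar chart in `visualize_features`: it min-max scales features that live in very different units (Hz for the spectral features, small unitless ratios for RMS and zero-crossing rate), so the small-valued features collapse to roughly zero. A minimal sketch of one alternative, scaling each feature against its own reference range, might look like the following. The function name `normalize_by_reference` and every range in `REFERENCE_RANGES` are illustrative assumptions, not values taken from this article, and should be tuned to your own music library.

```python
from typing import Dict, Tuple

# Hypothetical (min, max) reference ranges per feature; assumptions for illustration only.
REFERENCE_RANGES: Dict[str, Tuple[float, float]] = {
    "spectral_centroid_mean": (0.0, 8000.0),   # Hz
    "spectral_bandwidth_mean": (0.0, 4000.0),  # Hz
    "spectral_rolloff_mean": (0.0, 11025.0),   # Hz (Nyquist at sr=22050)
    "rms_mean": (0.0, 0.5),
    "zcr_mean": (0.0, 0.5),
}


def normalize_by_reference(features: Dict[str, float],
                           ranges: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Scale each feature into [0, 1] using its own (min, max) range."""
    normalized = {}
    for name, value in features.items():
        low, high = ranges[name]
        if high <= low:
            # Degenerate range: avoid dividing by zero.
            normalized[name] = 0.0
            continue
        scaled = (float(value) - low) / (high - low)
        # Clip so out-of-range values still fit on the radar axis.
        normalized[name] = min(max(scaled, 0.0), 1.0)
    return normalized


example = normalize_by_reference(
    {"spectral_centroid_mean": 2000.0, "rms_mean": 0.1, "zcr_mean": 0.05},
    REFERENCE_RANGES,
)
print(example)
```

With fixed reference ranges, charts from different songs also become comparable, which a per-song min-max scaling cannot offer.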

Building the Integrated Analysis System

1. A Comprehensive Music Analysis Class

🎯 Integrated analysis system


import json
from datetime import datetime
import os

class MusicAnalysisSystem:
    """
    A system that integrates all of the analysis components
    """
    def __init__(self):
        self.audio_processor = None
        self.structure_analyzer = None
        self.chord_analyzer = None
        self.rhythm_analyzer = None
        self.acoustic_analyzer = None
        self.analysis_results = {}
        
    def analyze_file(self, 
                    filepath: str,
                    output_dir: str = 'analysis_output') -> Dict:
        """
        Run a comprehensive analysis of a music file
        
        Parameters:
            filepath: path to the music file to analyze
            output_dir: directory where the results are saved
            
        Returns:
            results: dictionary of analysis results
        """
        print(f"\n{'='*50}")
        print(f"Starting analysis: {os.path.basename(filepath)}")
        print(f"{'='*50}\n")
        
        # Create the output directory
        os.makedirs(output_dir, exist_ok=True)
        
        # 1. Load the audio file
        print("[1/5] Loading audio file...")
        self.audio_processor = AudioProcessor()
        audio_data, sr = self.audio_processor.load_audio(filepath)
        
        if audio_data is None:
            raise ValueError("Failed to load the audio file")
        
        # Preprocessing
        audio_data = self.audio_processor.apply_preprocessing()
        
        # 2. Song structure analysis
        print("\n[2/5] Analyzing song structure...")
        self.structure_analyzer = StructureAnalyzer(self.audio_processor)
        structure_features = self.structure_analyzer.extract_features()
        ssm = self.structure_analyzer.compute_self_similarity_matrix()
        structure_segments = self.structure_analyzer.detect_segments(ssm)
        
        # Plot the structure
        structure_plot_path = os.path.join(
            output_dir, 'structure_analysis.png'
        )
        self.structure_analyzer.visualize_structure(structure_plot_path)
        
        # 3. Chord progression analysis
        print("\n[3/5] Analyzing chord progression...")
        self.chord_analyzer = ChordProgressionAnalyzer(self.audio_processor)
        chromagram = self.chord_analyzer.extract_chromagram()
        chord_sequence = self.chord_analyzer.detect_chords()
        chord_patterns = self.chord_analyzer.analyze_chord_progression_patterns()
        
        # Export the detected chords as MIDI
        midi_path = os.path.join(output_dir, 'detected_chords.mid')
        self.chord_analyzer.export_to_midi(midi_path)
        
        # 4. Rhythm analysis
        print("\n[4/5] Analyzing rhythm and beats...")
        self.rhythm_analyzer = RhythmAnalyzer(self.audio_processor)
        tempo = self.rhythm_analyzer.estimate_tempo()
        beats = self.rhythm_analyzer.detect_beats()
        rhythm_analysis = self.rhythm_analyzer.analyze_rhythm_patterns()
        
        # Plot the rhythm analysis
        rhythm_plot_path = os.path.join(output_dir, 'rhythm_analysis.png')
        self.rhythm_analyzer.visualize_rhythm(rhythm_plot_path)
        
        # 5. Acoustic feature analysis
        print("\n[5/5] Analyzing acoustic features...")
        self.acoustic_analyzer = AcousticAnalyzer(self.audio_processor)
        acoustic_features = self.acoustic_analyzer.extract_all_features()
        feature_df = self.acoustic_analyzer.create_feature_summary()
        
        # Plot the acoustic features
        acoustic_plot_path = os.path.join(output_dir, 'acoustic_features.png')
        self.acoustic_analyzer.visualize_features(acoustic_plot_path)
        
        # Save the features as CSV
        feature_csv_path = os.path.join(output_dir, 'acoustic_features.csv')
        feature_df.to_csv(feature_csv_path, index=False)
        # Combine the results
        self.analysis_results = {
            'file_info': {
                'filepath': filepath,
                'filename': os.path.basename(filepath),
                'duration': self.audio_processor.duration,
                'sample_rate': sr,
                'analysis_date': datetime.now().isoformat()
            },
            'structure': {
                'segments': structure_segments['segments'],
                'num_sections': len(structure_segments['segments'])
            },
            'harmony': {
                'chord_sequence': [
                    {
                        'chord': c['chord'],
                        'start_time': c['start_time'],
                        'end_time': c['end_time']
                    } for c in chord_sequence[:20]  # first 20 chords only
                ],
                'chord_statistics': chord_patterns,
                'estimated_key': acoustic_features['tonal'].get(
                    'estimated_key', 'Unknown'
                )
            },
            'rhythm': {
                'tempo': tempo,
                'rhythm_stability': rhythm_analysis['rhythm_stability'],
                'groove_type': rhythm_analysis['groove']['groove_type'],
                'num_beats': rhythm_analysis['num_beats'],
                'num_bars': rhythm_analysis['num_bars']
            },
            'acoustic': {
                'dynamic_range': acoustic_features['dynamics']['dynamic_range'],
                'spectral_centroid': acoustic_features['spectral'][
                    'spectral_centroid_mean'
                ],
                'key_features': {
                    k: v for k, v in acoustic_features['spectral'].items() 
                    if 'mean' in k
                }
            }
        }
        
        # Save the results as JSON
        json_path = os.path.join(output_dir, 'analysis_results.json')
        with open(json_path, 'w', encoding='utf-8') as f:
            # NumPy scalars and arrays are not JSON serializable by default,
            # so convert them with tolist() (fall back to str for anything else)
            json.dump(
                self.analysis_results, f, indent=2, ensure_ascii=False,
                default=lambda o: o.tolist() if hasattr(o, 'tolist') else str(o)
            )
        
        # Generate the summary report
        self._generate_summary_report(output_dir)
        
        print(f"\n{'='*50}")
        print("✅ Analysis complete!")
        print(f"Results saved to '{output_dir}'")
        print(f"{'='*50}\n")
        
        return self.analysis_results
    
    def _generate_summary_report(self, output_dir: str):
        """
        Generate a summary report of the analysis results
        """
        report_path = os.path.join(output_dir, 'analysis_report.txt')
        
        with open(report_path, 'w', encoding='utf-8') as f:
            f.write("=" * 60 + "\n")
            f.write("Music Analysis Report\n")
            f.write("=" * 60 + "\n\n")
            
            # File information
            f.write("[File Info]\n")
            f.write(f"File name: {self.analysis_results['file_info']['filename']}\n")
            f.write(f"Duration: {self.analysis_results['file_info']['duration']:.2f} s\n")
            f.write(f"Analyzed at: {self.analysis_results['file_info']['analysis_date']}\n")
            f.write("\n")
            
            # Structure (first five sections only)
            f.write("[Song Structure]\n")
            f.write(f"Number of sections: {self.analysis_results['structure']['num_sections']}\n")
            for seg in self.analysis_results['structure']['segments'][:5]:
                f.write(f"  - {seg['label']}: "
                       f"{seg['start']:.1f}s - {seg['end']:.1f}s\n")
            if len(self.analysis_results['structure']['segments']) > 5:
                f.write("  ...\n")
            f.write("\n")
            
            # Harmony
            f.write("[Harmony]\n")
            f.write(f"Estimated key: {self.analysis_results['harmony']['estimated_key']}\n")
            f.write(f"Most common chord: "
                   f"{self.analysis_results['harmony']['chord_statistics']['most_common_chord']}\n")
            f.write(f"Most common transition: "
                   f"{self.analysis_results['harmony']['chord_statistics']['most_common_transition']}\n")
            f.write("\n")
            
            # Rhythm
            f.write("[Rhythm]\n")
            f.write(f"Tempo: {self.analysis_results['rhythm']['tempo']:.1f} BPM\n")
            f.write(f"Rhythm stability: "
                   f"{self.analysis_results['rhythm']['rhythm_stability']:.2%}\n")
            f.write(f"Groove type: {self.analysis_results['rhythm']['groove_type']}\n")
            f.write(f"Total beats: {self.analysis_results['rhythm']['num_beats']}\n")
            f.write(f"Bars: {self.analysis_results['rhythm']['num_bars']}\n")
            f.write("\n")
            
            # Acoustic characteristics
            f.write("[Acoustic Characteristics]\n")
            f.write(f"Dynamic range: "
                   f"{self.analysis_results['acoustic']['dynamic_range']:.1f} dB\n")
            f.write(f"Spectral centroid: "
                   f"{self.analysis_results['acoustic']['spectral_centroid']:.1f} Hz\n")
            f.write("\n")
            
            f.write("=" * 60 + "\n")
        
        print(f"✅ Analysis report written: {report_path}")
    
    def batch_analyze(self, 
                     file_list: List[str],
                     output_base_dir: str = 'batch_analysis') -> Dict:
        """
        Analyze multiple files in one batch
        """
        results = {}
        
        for i, filepath in enumerate(file_list):
            print(f"\n[{i+1}/{len(file_list)}] Analyzing {os.path.basename(filepath)}...")
            
            # One output directory per file
            filename_base = os.path.splitext(os.path.basename(filepath))[0]
            output_dir = os.path.join(output_base_dir, filename_base)
            
            try:
                result = self.analyze_file(filepath, output_dir)
                results[filepath] = result
            except Exception as e:
                # Record the failure and continue with the next file
                print(f"❌ An error occurred: {e}")
                results[filepath] = {'error': str(e)}
        
        # Statistics over the whole batch
        self._generate_batch_statistics(results, output_base_dir)
        
        return results
    
    def _generate_batch_statistics(self, 
                                 results: Dict,
                                 output_dir: str):
        """
        Generate summary statistics for a batch analysis
        """
        stats_path = os.path.join(output_dir, 'batch_statistics.json')
        
        # Aggregate only the analyses that succeeded
        successful_results = {
            k: v for k, v in results.items() if 'error' not in v
        }
        
        if not successful_results:
            return
        
        # Compute the statistics (cast to float so the values are JSON serializable)
        tempos = [r['rhythm']['tempo'] for r in successful_results.values()]
        durations = [r['file_info']['duration'] for r in successful_results.values()]
        
        statistics = {
            'total_files': len(results),
            'successful_analyses': len(successful_results),
            'failed_analyses': len(results) - len(successful_results),
            'average_tempo': float(np.mean(tempos)),
            'tempo_std': float(np.std(tempos)),
            'average_duration': float(np.mean(durations)),
            'total_duration': float(np.sum(durations))
        }
        
        with open(stats_path, 'w', encoding='utf-8') as f:
            json.dump(statistics, f, indent=2)
        
        print(f"\n✅ Batch statistics written: {stats_path}")
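
One thing to watch in `batch_analyze`: the output directory is derived from the file's base name alone, so two inputs such as `rock/intro.mp3` and `jazz/intro.wav` would write into the same folder and overwrite each other's results. A minimal sketch of one way to keep the directories distinct is shown here. The helper name `assign_output_dirs` and the `_2`, `_3` suffix scheme are assumptions made for illustration, not part of the system above.

```python
import os
from typing import Dict, List


def assign_output_dirs(file_list: List[str], output_base_dir: str) -> Dict[str, str]:
    """Map each input file to a distinct output directory."""
    assigned: Dict[str, str] = {}
    used = set()
    for filepath in file_list:
        base = os.path.splitext(os.path.basename(filepath))[0]
        candidate = base
        suffix = 2
        # Append _2, _3, ... until the directory name is not taken yet.
        while candidate in used:
            candidate = f"{base}_{suffix}"
            suffix += 1
        used.add(candidate)
        assigned[filepath] = os.path.join(output_base_dir, candidate)
    return assigned


dirs = assign_output_dirs(
    ["rock/intro.mp3", "jazz/intro.wav", "jazz/outro.wav"],
    "batch_analysis",
)
for src, dst in dirs.items():
    print(src, "->", dst)
```

The resulting mapping could be consulted inside the batch loop instead of recomputing `filename_base` for every file.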

Practical Usage Examples

1. Analyzing a Single File

🎵 Basic usage


# Initialize the system
analyzer = MusicAnalysisSystem()

# Analyze a single file
results = analyzer.analyze_file(
    'path/to/your/music.mp3',
    output_dir='analysis_results'
)

# Inspect the results
print(f"Tempo: {results['rhythm']['tempo']:.1f} BPM")
print(f"Estimated key: {results['harmony']['estimated_key']}")
print(f"Structure: {len(results['structure']['segments'])} sections")
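
A note on the `estimated_key` value printed above: the tonal feature code in `AcousticAnalyzer` correlates the mean chroma vector against only two profiles, so the result can only ever be `C_major` or `A_minor`. In addition, the minor profile listed there has its tonic weight at index 0, which lines up with C rather than A when compared directly against a chroma vector that starts at C. A minimal sketch of how the same profile values could be rotated to cover all 24 major and minor keys is shown below. The function names `pearson`, `rotate`, and `estimate_key` are assumptions for illustration, and the example is written in plain Python so that it runs without the audio stack.

```python
import math
from typing import List, Sequence, Tuple

PITCH_CLASSES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

# Key profiles with the tonic at index 0 (same values as in the tonal feature code).
MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR_PROFILE = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat vector carries no key information
    return cov / math.sqrt(var_x * var_y)


def rotate(profile: Sequence[float], tonic: int) -> List[float]:
    """Shift a tonic-at-index-0 profile so the tonic weight lands on `tonic`."""
    return [profile[(i - tonic) % 12] for i in range(12)]


def estimate_key(chroma_mean: Sequence[float]) -> Tuple[str, float]:
    """Return the best-matching key name and its correlation over 24 keys."""
    best_name, best_corr = "Unknown", -2.0
    for tonic, name in enumerate(PITCH_CLASSES):
        for mode, profile in (("major", MAJOR_PROFILE), ("minor", MINOR_PROFILE)):
            corr = pearson(chroma_mean, rotate(profile, tonic))
            if corr > best_corr:
                best_name, best_corr = f"{name}_{mode}", corr
    return best_name, best_corr


# Synthetic check: a chroma vector shaped exactly like the G major profile.
key, confidence = estimate_key(rotate(MAJOR_PROFILE, 7))
print(key, round(confidence, 3))
```

In the real pipeline, `chroma_mean` would be the 12-element mean of the chromagram. Profile correlation of this kind remains a rough estimate and can confuse relative keys, so treat the result as a hint rather than ground truth.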

2. Batch Processing

📁 Analyzing multiple files at once


import glob

# Build the list of files to analyze
file_list = glob.glob('music_library/*.mp3')

# Run the batch analysis
analyzer = MusicAnalysisSystem()
batch_results = analyzer.batch_analyze(
    file_list,
    output_base_dir='batch_analysis_results'
)

# Check for errors
for filepath, result in batch_results.items():
    if 'error' in result:
        print(f"❌ {filepath}: {result['error']}")
    else:
        print(f"✅ {filepath}: analysis complete")
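
The `glob.glob('music_library/*.mp3')` call above only picks up MP3 files sitting directly in one folder. If your library is nested or mixes formats, a small collector along these lines could build the file list instead. This is a sketch: the helper name `collect_audio_files` and the extension list are assumptions, and which formats actually load depends on the audio backend you have installed.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List

# Extensions to pick up; an assumption for this sketch.
AUDIO_EXTENSIONS = (".mp3", ".wav", ".flac")


def collect_audio_files(root: str,
                        extensions: Iterable[str] = AUDIO_EXTENSIONS) -> List[str]:
    """Recursively list audio files under `root`, sorted for reproducible runs."""
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        str(path)
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in wanted
    )


# Demo on a throwaway directory so the snippet runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "album").mkdir()
    for name in ("a.mp3", "album/b.WAV", "notes.txt"):
        (Path(tmp) / name).touch()
    found = collect_audio_files(tmp)
    print([Path(p).name for p in found])
```

The returned list can be passed straight to `batch_analyze` as `file_list`.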

3. Custom Analysis Pipelines

⚙️ Customization example


class CustomMusicAnalyzer(MusicAnalysisSystem):
    """
    特定の用途向けにカスタマイズした分析システム
    """
    
    def analyze_for_dj(self, filepath: str) -&gt; Dict:
        """
        DJ向けの分析（BPM、キー、エネルギーレベル）
        """
        # 基本分析
        results = self.analyze_file(filepath)
        
        # DJ向け追加分析
        dj_info = {
            'bpm': results['rhythm']['tempo'],
            'key': results['harmony']['estimated_key'],
            'energy_level': self._calculate_energy_level(results),
            'mix_in_point': self._find_mix_point(results),
            'mix_out_point': self._find_mix_out_point(results)
        }
        
        return dj_info
    
    def analyze_for_cover(self, filepath: str) -&gt; Dict:
        """
        カバー演奏向けの分析（コード進行、構造、キー）
        """
        results = self.analyze_file(filepath)
        
        # カバー向け情報の抽出
        cover_info = {
            'original_key': results['harmony']['estimated_key'],
            'chord_progression': results['harmony']['chord_sequence'],
            'song_structure': self._simplify_structure(
                results['structure']['segments']
            ),
            'tempo': results['rhythm']['tempo'],
            'time_signature': self._estimate_time_signature(results)
        }
        
        return cover_info
    
    def _calculate_energy_level(self, results: Dict) -&gt; float:
        """
        エネルギーレベルを計算（0-1）
        """
        # スペクトル中心、テンポ、ダイナミックレンジから推定
        spectral_centroid = results['acoustic']['spectral_centroid']
        tempo = results['rhythm']['tempo']
        dynamic_range = results['acoustic']['dynamic_range']
        
        # 正規化と重み付け
        energy = (
            (spectral_centroid / 5000) * 0.3 +
            (tempo / 200) * 0.4 +
            (dynamic_range / 60) * 0.3
        )
        
        return min(max(energy, 0), 1)

# 使用例
custom_analyzer = CustomMusicAnalyzer()
dj_info = custom_analyzer.analyze_for_dj('path/to/track.mp3')
print(f"Energy Level: {dj_info['energy_level']:.2f}")
print(f"Mix In Point: {dj_info['mix_in_point']:.1f}s")
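The energy score above is a weighted sum of three normalized features. To make the arithmetic concrete, here is a standalone sketch that reuses the same weights (0.3, 0.4, 0.3) and divisors (5000, 200, 60). It assumes the spectral centroid is in Hz and the dynamic range in dB, which the article does not state. Clipping each term before weighting is an addition for illustration, so that one extreme feature cannot saturate the whole score; the original method clamps only the final sum.

```python
def clip01(x: float) -> float:
    """Clamp a value to the range [0, 1]."""
    return min(max(x, 0.0), 1.0)


def energy_level(spectral_centroid_hz: float,
                 tempo_bpm: float,
                 dynamic_range_db: float) -> float:
    """Weighted energy score in [0, 1] (units are assumptions)."""
    # Normalize each feature with the divisors used in the article,
    # then clip it so that no single term exceeds its weight
    centroid_term = clip01(spectral_centroid_hz / 5000.0)
    tempo_term = clip01(tempo_bpm / 200.0)
    dynamics_term = clip01(dynamic_range_db / 60.0)
    return 0.3 * centroid_term + 0.4 * tempo_term + 0.3 * dynamics_term


# Worked example: 0.3*0.5 + 0.4*0.6 + 0.3*0.5 = 0.54
mid_energy = energy_level(2500.0, 120.0, 30.0)
print(f"{mid_energy:.2f}")

# A very bright but slow, compressed track: 0.3*1.0 + 0.4*0.3 + 0.3*0.1 = 0.45
# Without per-term clipping the centroid term alone would be 1.2,
# and the final clamp would report the maximum score of 1.0.
bright_energy = energy_level(20000.0, 60.0, 6.0)
print(f"{bright_energy:.2f}")
```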

Advanced Application: Integrating Machine Learning

1. Building a Genre Classifier

🤖 Automatic genre classification with machine learning


from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
import pickle

class GenreClassifier:
    """
    Automatically classifies music by genre
    """
    # Acoustic features read from results['acoustic']['key_features']
    ACOUSTIC_KEYS = ['spectral_centroid_mean', 'spectral_bandwidth_mean',
                     'spectral_rolloff_mean', 'rms_mean', 'zcr_mean']

    def __init__(self):
        self.analyzer = MusicAnalysisSystem()
        self.scaler = StandardScaler()
        self.classifier = RandomForestClassifier(
            n_estimators=100,
            random_state=42
        )
        # Names in the same order as the feature vector
        self.feature_names = (
            ['tempo', 'rhythm_stability', 'is_straight_groove']
            + self.ACOUSTIC_KEYS
            + ['unique_chords']
        )
        
    def extract_features_for_ml(self, filepath: str) -> np.ndarray:
        """
        Extract features for machine learning
        """
        # Full analysis
        results = self.analyzer.analyze_file(filepath)
        
        # Feature selection
        features = []
        
        # Rhythm features
        features.extend([
            results['rhythm']['tempo'],
            results['rhythm']['rhythm_stability'],
            1 if results['rhythm']['groove_type'] == 'straight' else 0
        ])
        
        # Acoustic features: a missing key falls back to 0.0 so that
        # every track yields a vector of the same length
        key_features = results['acoustic']['key_features']
        for key in self.ACOUSTIC_KEYS:
            features.append(key_features.get(key, 0.0))
        
        # Harmony feature: chord diversity
        unique_chords = len(set(
            c['chord'] for c in results['harmony']['chord_sequence']
        ))
        features.append(unique_chords)
        
        return np.array(features, dtype=float)
    
    def train(self, 
             training_data: List[Tuple[str, str]],
             save_model: bool = True):
        """
        Train the genre classifier
        
        Parameters:
            training_data: list of (filepath, genre) pairs
            save_model: whether to save the model
        """
        X = []
        y = []
        
        print("Extracting features...")
        for filepath, genre in training_data:
            try:
                features = self.extract_features_for_ml(filepath)
                X.append(features)
                y.append(genre)
            except Exception as e:
                print(f"Skipped: {filepath} - {e}")
        
        if not X:
            raise ValueError("No features could be extracted")
        
        X = np.array(X)
        
        # Standardize the features
        X_scaled = self.scaler.fit_transform(X)
        
        # Train the classifier
        self.classifier.fit(X_scaled, y)
        
        # Feature importances
        importances = self.classifier.feature_importances_
        print("\nFeature importances:")
        for name, imp in zip(self.feature_names, importances):
            print(f"  {name}: {imp:.3f}")
        
        if save_model:
            self.save_model('genre_classifier.pkl')
    
    def predict(self, filepath: str) -> Tuple[str, float]:
        """
        Predict the genre
        
        Returns:
            genre: predicted genre
            confidence: highest predicted class probability
        """
        features = self.extract_features_for_ml(filepath)
        features_scaled = self.scaler.transform([features])
        
        # Prediction
        prediction = self.classifier.predict(features_scaled)[0]
        probabilities = self.classifier.predict_proba(features_scaled)[0]
        confidence = float(np.max(probabilities))
        
        return prediction, confidence
    
    def save_model(self, filepath: str):
        """
        Save the model
        """
        model_data = {
            'classifier': self.classifier,
            'scaler': self.scaler,
            'feature_names': self.feature_names
        }
        
        with open(filepath, 'wb') as f:
            pickle.dump(model_data, f)
        
        print(f"Model saved: {filepath}")
    
    def load_model(self, filepath: str):
        """
        Load the model (only unpickle files from a source you trust)
        """
        with open(filepath, 'rb') as f:
            model_data = pickle.load(f)
        
        self.classifier = model_data['classifier']
        self.scaler = model_data['scaler']
        self.feature_names = model_data['feature_names']
        
        print(f"Model loaded: {filepath}")

# Usage example
genre_classifier = GenreClassifier()

# Prepare the training data
training_data = [
    ('rock/song1.mp3', 'rock'),
    ('jazz/song2.mp3', 'jazz'),
    ('classical/song3.mp3', 'classical'),
    # ... add more
]

# Train the model
genre_classifier.train(training_data)

# Predict the genre of a new track
genre, confidence = genre_classifier.predict('unknown_song.mp3')
print(f"Predicted genre: {genre} (confidence: {confidence:.2%})")
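The example above trains on every labeled track, which leaves nothing to measure accuracy with. One common approach is to hold out part of each genre before training. The sketch below splits `(filepath, genre)` pairs per genre using only the standard library. The function name `stratified_split`, the 20% ratio, and the generated file names are illustrative assumptions, not part of the article's classifier.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Sample = Tuple[str, str]  # (filepath, genre)


def stratified_split(data: List[Sample],
                     test_ratio: float = 0.2,
                     seed: int = 42) -> Tuple[List[Sample], List[Sample]]:
    """Split labeled tracks into train/test sets, genre by genre."""
    by_genre: Dict[str, List[Sample]] = defaultdict(list)
    for filepath, genre in data:
        by_genre[genre].append((filepath, genre))

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train: List[Sample] = []
    test: List[Sample] = []
    for genre in sorted(by_genre):
        items = by_genre[genre]
        rng.shuffle(items)
        if len(items) > 1:
            # Hold out at least one track, but always keep one for training
            n_test = min(len(items) - 1,
                         max(1, round(len(items) * test_ratio)))
        else:
            n_test = 0  # a single track can only be used for training
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


# Hypothetical library: 10 tracks for each of three genres
labeled = [(f'{g}/song{i}.mp3', g)
           for g in ('rock', 'jazz', 'classical')
           for i in range(10)]
train_set, test_set = stratified_split(labeled)
print(len(train_set), len(test_set))
```

With such a split, the classifier could be trained with `train(train_set, save_model=False)` and then checked by comparing `predict(path)[0]` with the known genre for each held-out pair.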

Performance Optimization

1. Speeding Up with Parallel Processing

⚡ Using multiprocessing


import multiprocessing as mp
import os
from functools import partial


def _analyze_single(filepath, output_base_dir):
    """
    Worker for one file. It lives at module level because mp.Pool
    pickles the function it sends to worker processes, and functions
    defined inside a method cannot be pickled.
    """
    try:
        analyzer = MusicAnalysisSystem()
        filename_base = os.path.splitext(
            os.path.basename(filepath)
        )[0]
        output_path = os.path.join(output_base_dir, filename_base)
        
        result = analyzer.analyze_file(filepath, output_path)
        return filepath, result
    except Exception as e:
        return filepath, {'error': str(e)}


def _analyze_segment(segment_data):
    """
    Worker for one segment: (audio, index, sample rate) -> (index, results)
    """
    segment_audio, segment_idx, sr = segment_data
    
    # Create a temporary processor for this segment
    temp_processor = AudioProcessor()
    temp_processor.audio_data = segment_audio
    temp_processor.sr = sr
    temp_processor.duration = len(segment_audio) / sr
    
    results = {}
    
    # Acoustic analysis
    acoustic = AcousticAnalyzer(temp_processor)
    results['acoustic'] = acoustic.extract_all_features()
    
    # Rhythm analysis
    rhythm = RhythmAnalyzer(temp_processor)
    results['tempo'] = rhythm.estimate_tempo()
    
    return segment_idx, results


class ParallelMusicAnalyzer:
    """
    Analysis system accelerated with parallel processing
    """
    def __init__(self, n_workers: Optional[int] = None):
        self.n_workers = n_workers or mp.cpu_count()
        
    def analyze_batch_parallel(self, 
                             file_list: List[str],
                             output_dir: str = 'parallel_analysis') -> Dict:
        """
        Batch analysis in parallel
        """
        # Bind the output directory to the worker function
        analyze_func = partial(_analyze_single, output_base_dir=output_dir)
        
        # Run in parallel on a process pool
        with mp.Pool(self.n_workers) as pool:
            results = pool.map(analyze_func, file_list)
        
        # Convert the (filepath, result) pairs to a dict
        return dict(results)
    
    def analyze_segments_parallel(self, 
                                filepath: str,
                                segment_duration: float = 30.0) -> List[Dict]:
        """
        Split a track into segments and analyze them in parallel
        """
        # Load the audio file
        processor = AudioProcessor()
        processor.load_audio(filepath)
        
        # Split into segments
        segments = processor.extract_segments(
            segment_duration=segment_duration,
            overlap=0.1
        )
        
        # Pair each segment with its index and the sample rate
        segment_data = [
            (seg, i, processor.sr) for i, seg in enumerate(segments)
        ]
        
        # Parallel processing
        with mp.Pool(self.n_workers) as pool:
            segment_results = pool.map(_analyze_segment, segment_data)
        
        # pool.map already preserves input order; sort defensively
        segment_results.sort(key=lambda x: x[0])
        
        return [result[1] for result in segment_results]

# Usage example (the __main__ guard is required on platforms that
# start worker processes with "spawn", such as Windows and macOS)
if __name__ == '__main__':
    import time

    parallel_analyzer = ParallelMusicAnalyzer(n_workers=8)

    # Analyze many files in parallel
    start_time = time.time()
    results = parallel_analyzer.analyze_batch_parallel(
        file_list=['song1.mp3', 'song2.mp3', 'song3.mp3'],
        output_dir='parallel_results'
    )
    end_time = time.time()

    print(f"Analysis finished: {end_time - start_time:.2f} s")
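Segment-level parallelism depends on how the track is cut up. The sketch below shows one way to compute segment boundaries in samples. It assumes `overlap` is the fraction of each segment shared with the next one; the article does not show how `extract_segments` interprets that parameter, so this reading and the function name `segment_bounds` are hypothetical.

```python
from typing import List, Tuple


def segment_bounds(n_samples: int,
                   sr: int,
                   segment_duration: float = 30.0,
                   overlap: float = 0.1) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of fixed-length segments."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    seg_len = int(segment_duration * sr)
    if seg_len <= 0:
        raise ValueError("segment_duration * sr must be at least 1 sample")

    # Distance between the starts of consecutive segments
    hop = max(1, int(seg_len * (1.0 - overlap)))

    bounds: List[Tuple[int, int]] = []
    start = 0
    while start < n_samples:
        end = min(start + seg_len, n_samples)
        bounds.append((start, end))
        if end == n_samples:
            break  # the last segment reached the end of the track
        start += hop
    return bounds


# Toy numbers: a 25 s signal at 10 Hz, 10 s segments, 50% overlap
toy_bounds = segment_bounds(n_samples=250, sr=10,
                            segment_duration=10.0, overlap=0.5)
print(toy_bounds)
# → [(0, 100), (50, 150), (100, 200), (150, 250)]
```

Note that the final segment can be shorter than `segment_duration` when the track length is not a multiple of the hop, and tempo estimates from very short segments tend to be less reliable.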

Practical Application Examples

1. Automatic Playlist Generation

🎧 Playlist generation based on similar tracks


import glob
import os
from scipy.spatial.distance import cosine

class PlaylistGenerator:
    """
    Automatically generates playlists based on song analysis
    """
    def __init__(self, music_library: List[str]):
        self.analyzer = MusicAnalysisSystem()
        self.library = music_library
        self.features_cache = {}
        
    def analyze_library(self):
        """
        Analyze the whole music library
        """
        print("Analyzing library...")
        
        for filepath in self.library:
            if filepath not in self.features_cache:
                try:
                    results = self.analyzer.analyze_file(filepath)
                    # Build the feature vector
                    features = self._extract_feature_vector(results)
                    self.features_cache[filepath] = features
                except Exception as e:
                    print(f"Error: {filepath} - {e}")
    
    def _extract_feature_vector(self, analysis_results: Dict) -> np.ndarray:
        """
        Extract a feature vector from the analysis results
        """
        features = []
        
        # Tempo (normalized)
        tempo = analysis_results['rhythm']['tempo']
        features.append(tempo / 200.0)  # assumes 0-200 BPM
        
        # Rhythm stability
        features.append(analysis_results['rhythm']['rhythm_stability'])
        
        # Spectral centroid (normalized)
        spectral_centroid = analysis_results['acoustic']['spectral_centroid']
        features.append(spectral_centroid / 10000.0)
        
        # Other features...
        
        return np.array(features)
    
    def generate_playlist(self, 
                         seed_track: str,
                         playlist_length: int = 20,
                         diversity: float = 0.3) -> List[str]:
        """
        Generate a playlist starting from a seed track
        
        Parameters:
            seed_track: the track to start from
            playlist_length: length of the playlist
            diversity: diversity parameter (0-1)
            
        Returns:
            playlist: list of track paths
        """
        if seed_track not in self.features_cache:
            results = self.analyzer.analyze_file(seed_track)
            seed_features = self._extract_feature_vector(results)
            self.features_cache[seed_track] = seed_features
        else:
            seed_features = self.features_cache[seed_track]
        
        # Compute similarity (cosine similarity = 1 - cosine distance)
        similarities = {}
        for filepath, features in self.features_cache.items():
            if filepath != seed_track:
                similarity = 1 - cosine(seed_features, features)
                similarities[filepath] = similarity
        
        # Sort by similarity, most similar first
        sorted_tracks = sorted(
            similarities.items(), 
            key=lambda x: x[1], 
            reverse=True
        )
        
        # Build the playlist
        playlist = [seed_track]
        
        # Selection that takes diversity into account
        for track, similarity in sorted_tracks:
            if len(playlist) >= playlist_length:
                break
            
            # Add a random element to keep the playlist diverse
            if np.random.random() < (1 - diversity) or similarity > 0.8:
                playlist.append(track)
        
        # Top up with skipped tracks if random skipping left the list short
        for track, _ in sorted_tracks:
            if len(playlist) >= playlist_length:
                break
            if track not in playlist:
                playlist.append(track)
        
        return playlist
    
    def generate_mood_playlist(self, 
                             mood: str,
                             length: int = 20) -> List[str]:
        """
        Generate a playlist based on a mood
        """
        mood_criteria = {
            'energetic': {
                'min_tempo': 120,
                'min_spectral_centroid': 3000
            },
            'relaxing': {
                'max_tempo': 100,
                'max_spectral_centroid': 2000
            },
            'uplifting': {
                'min_tempo': 110,
                'major_key': True
            }
        }
        
        if mood not in mood_criteria:
            raise ValueError(f"Unknown mood: {mood}")
        
        criteria = mood_criteria[mood]
        suitable_tracks = []
        
        # Select the tracks that match the criteria
        for filepath in self.library:
            if filepath in self.features_cache:
                # The criteria check belongs here. It is omitted in this
                # simplified version, so every analyzed track is accepted.
                suitable_tracks.append(filepath)
        
        # Pick at random
        if len(suitable_tracks) > length:
            playlist = np.random.choice(
                suitable_tracks, 
                size=length, 
                replace=False
            ).tolist()
        else:
            playlist = suitable_tracks
        
        return playlist

# Usage example
library = glob.glob('music_library/**/*.mp3', recursive=True)
playlist_gen = PlaylistGenerator(library)

# Analyze the library
playlist_gen.analyze_library()

# Generate a playlist from a seed track
playlist = playlist_gen.generate_playlist(
    seed_track='favorite_song.mp3',
    playlist_length=20,
    diversity=0.3
)

print("Generated playlist:")
for i, track in enumerate(playlist, 1):
    print(f"{i}. {os.path.basename(track)}")
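The mood playlist above leaves the criteria check as a placeholder. A minimal sketch of such a check follows. It assumes each track has a dictionary of raw features with the keys `tempo`, `spectral_centroid`, and `major_key`; this is a hypothetical structure, since the article's `features_cache` stores only a normalized vector. Rules named `min_<feature>` and `max_<feature>` are treated as thresholds, and any other rule as an exact match.

```python
from typing import Dict, Union

Value = Union[int, float, bool]


def matches_mood(track: Dict[str, Value], criteria: Dict[str, Value]) -> bool:
    """Return True when the track satisfies every rule in criteria."""
    for rule, expected in criteria.items():
        if rule.startswith('min_'):
            # A missing feature counts as failing the threshold
            if track.get(rule[4:], float('-inf')) < expected:
                return False
        elif rule.startswith('max_'):
            if track.get(rule[4:], float('inf')) > expected:
                return False
        elif track.get(rule) != expected:
            # Flags such as 'major_key' must match exactly
            return False
    return True


# Criteria copied from the article's mood_criteria table
mood_criteria = {
    'energetic': {'min_tempo': 120, 'min_spectral_centroid': 3000},
    'relaxing': {'max_tempo': 100, 'max_spectral_centroid': 2000},
    'uplifting': {'min_tempo': 110, 'major_key': True},
}

# Hypothetical raw features for two tracks
fast_bright = {'tempo': 128, 'spectral_centroid': 3500, 'major_key': True}
slow_dark = {'tempo': 80, 'spectral_centroid': 1500, 'major_key': False}

for mood, rules in mood_criteria.items():
    print(mood, matches_mood(fast_bright, rules), matches_mood(slow_dark, rules))
```

To use a check like this inside `generate_mood_playlist`, the generator would need to cache the raw values alongside the normalized vector during `analyze_library`.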

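Summary: The Future of Music Analysis

This article walked through building a comprehensive song analysis system in Python. The key points are:

Audio processing basics: loading and preprocessing audio files, centered on librosa
Song structure analysis: automatic structure recognition using a self-similarity matrix and segment detection
Chord progression detection: harmonic analysis using chroma vectors and template matching
Rhythm analysis: tempo estimation, beat detection, and groove analysis
Acoustic feature extraction: comprehensive analysis of spectrum, timbre, and dynamics
Machine learning integration: applications to genre classification and similar-track search
Practical applications: playlist generation, DJ support, and cover performance support

These techniques can make music production more efficient, open up new creative possibilities, and improve the listening experience. By combining AI with music theory, song analysis that once required specialist skills is becoming accessible to anyone.

The next article, "How to Use AI Composition Support Tools", will build on the analysis techniques introduced here and explain how to develop tools that support actual composition work. We look forward to the new creative horizons that the fusion of music and technology will open up.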
