Practical Guide ③: Building a Memo App with Speech Recognition AI

Have you ever wanted to turn speech into text right away during a meeting or a brainstorming session?
A memo app built on speech recognition AI meets exactly that need.

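Traditional voice input produced frequent misrecognitions and was often frustrating to use.
As of 2025, however, high-accuracy speech recognition services such as OpenAI Whisper and Google Speech-to-Text are available,
and accurate transcription is possible in near real time.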

This article explains how to build a smartphone app that converts speech to text immediately and saves the result.

1. Overall development flow
2. Using the Whisper API
3. Implementation example in Flutter
4. Ideas for extending the app
5. Cost and optimization tips
6. Summary

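1. Overall development flow

A speech recognition memo app is developed in the following steps.

Choose the development environment
- iOS (Swift), Android (Kotlin), or Flutter / React Native for cross-platform development
Implement audio input
- Capture audio from the microphone
Send the audio to a speech recognition API
- Transcribe it with the Whisper API or Google Speech-to-Text
Display the text result
- Show the transcription on screen as soon as it is returned
Save and manage memos
- Save the text as a memo, add tags, and sync to the cloud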
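The last step (saving, tagging, and syncing) is easier to reason about with a concrete data shape. The following is a minimal sketch, not a prescribed design: the `Memo` class and its field names are hypothetical and chosen only for illustration. It shows one way a transcribed note could be converted to and from JSON so that it can be written to local storage or sent to a sync backend.

```dart
import 'dart:convert';

/// Hypothetical memo model: one transcribed note with optional tags.
/// The class name and fields are assumptions made for this sketch.
class Memo {
  Memo({required this.text, required this.createdAt, this.tags = const []});

  /// Restores a memo from the map produced by [toJson].
  factory Memo.fromJson(Map<String, dynamic> json) => Memo(
        text: json['text'] as String,
        createdAt: DateTime.parse(json['createdAt'] as String),
        tags: List<String>.from((json['tags'] as List?) ?? const []),
      );

  final String text;
  final DateTime createdAt;
  final List<String> tags;

  /// Converts the memo to a JSON-compatible map.
  Map<String, dynamic> toJson() => {
        'text': text,
        'createdAt': createdAt.toIso8601String(),
        'tags': tags,
      };

  /// Convenience wrapper that returns the memo as a JSON string.
  String encode() => jsonEncode(toJson());
}
```

Calling `encode()` yields a string that can be stored in a file or a key-value store, and `Memo.fromJson(jsonDecode(stored))` restores it. Keeping serialization in the model means the storage layer can change later without touching the rest of the app.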

2. Using the Whisper API

With OpenAI's Whisper API, a basic request looks like this.

POST https://api.openai.com/v1/audio/transcriptions
Headers:
  Authorization: Bearer YOUR_API_KEY
  Content-Type: multipart/form-data

Body:
  - file: audio.mp3
  - model: whisper-1
  - language: ja

This request converts the uploaded audio file to text.
In a smartphone app, the recorded audio is saved to a temporary file and then sent to the API.
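The request is only half of the exchange, so it helps to see how the reply might be interpreted. The sketch below assumes the default JSON response format, where a successful reply is an object with a `text` field and a failed one carries an `error` object with a `message`. The helper name `extractTranscript` is hypothetical, and it works on a plain status code and body so it does not depend on any particular HTTP client.

```dart
import 'dart:convert';

/// Hypothetical helper: turns a raw HTTP status code and response body
/// into the transcript text, or throws with a readable message.
String extractTranscript(int statusCode, String body) {
  // jsonDecode throws a FormatException if the body is not valid JSON.
  final decoded = jsonDecode(body);
  if (decoded is! Map<String, dynamic>) {
    throw FormatException('Unexpected response shape', body);
  }

  if (statusCode != 200) {
    // Assumed error shape: {"error": {"message": "..."}}.
    final error = decoded['error'];
    final message = error is Map ? error['message'] : null;
    throw StateError('Transcription failed ($statusCode): ${message ?? body}');
  }

  final text = decoded['text'];
  if (text is! String) {
    throw FormatException('Response has no "text" field', body);
  }
  return text;
}
```

Separating parsing from networking like this keeps the error handling easy to unit test without making real API calls.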

3. Implementation example in Flutter

① Required packages

dependencies:
  http: ^1.0.0
  record: ^5.0.0   # for audio recording

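② Record audio

import 'dart:io';

import 'package:record/record.dart';

// record 5.x exposes AudioRecorder (the Record class from 4.x was renamed).
final record = AudioRecorder();

Future<void> startRecording() async {
  if (await record.hasPermission()) {
    // Record AAC audio to an absolute path in the temp directory.
    // The package does not encode MP3, so the file is saved as .m4a.
    final path = '${Directory.systemTemp.path}/audio.m4a';
    await record.start(
      const RecordConfig(encoder: AudioEncoder.aacLc),
      path: path,
    );
  }
}

Future<String?> stopRecording() async {
  return await record.stop(); // returns the path of the recorded file
}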
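Because each recording is only a temporary file on its way to the API, it can be useful to give every recording its own name and remove it afterwards. The following is a minimal sketch under stated assumptions: the helper names `newRecordingPath` and `deleteRecording` are hypothetical, recordings are assumed to be AAC audio stored as `.m4a`, and `Directory.systemTemp` is assumed to be writable on the target device.

```dart
import 'dart:io';

/// Hypothetical helper: builds a unique .m4a path in the system temp
/// directory so that consecutive recordings do not overwrite each other.
String newRecordingPath({DateTime? now}) {
  final timestamp = (now ?? DateTime.now()).millisecondsSinceEpoch;
  return '${Directory.systemTemp.path}/memo_$timestamp.m4a';
}

/// Hypothetical helper: deletes the temporary recording, if it exists,
/// once it has been transcribed.
Future<void> deleteRecording(String path) async {
  final file = File(path);
  if (await file.exists()) {
    await file.delete();
  }
}
```

The path returned by `newRecordingPath()` can be passed to the recorder when recording starts, and `deleteRecording` can be called after the transcript has been saved so that audio files do not accumulate on the device.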

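③ Transcribe with the Whisper API

import 'dart:convert';
import 'dart:io';
import 'package:http/http.dart' as http;

Future<String> transcribeAudio(String filePath) async {
  // Placeholder only: avoid shipping a real API key inside a client app.
  const apiKey = "YOUR_API_KEY";
  final request = http.MultipartRequest(
    'POST',
    Uri.parse('https://api.openai.com/v1/audio/transcriptions'),
  );

  // Attach the auth header, the audio file, and the form fields.
  request.headers['Authorization'] = 'Bearer $apiKey';
  request.files.add(await http.MultipartFile.fromPath('file', filePath));
  request.fields['model'] = 'whisper-1';
  request.fields['language'] = 'ja';

  final response = await request.send();
  final respStr = await response.stream.bytesToString();

  // Fail loudly on errors instead of reading a missing 'text' field.
  if (response.statusCode != 200) {
    throw HttpException(
      'Transcription failed (${response.statusCode}): $respStr',
    );
  }

  final data = jsonDecode(respStr) as Map<String, dynamic>;
  return data['text'] as String;
}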
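Long recordings can be rejected by the API before any transcription happens, so a size check before uploading may save a wasted request. This sketch assumes the documented 25 MB upload limit for the transcription endpoint, interpreted here as 25 * 1024 * 1024 bytes. The names `maxUploadBytes`, `isWithinUploadLimit`, and `canUpload` are hypothetical.

```dart
import 'dart:io';

/// Assumed upload limit: 25 MB, interpreted as 25 * 1024 * 1024 bytes.
const int maxUploadBytes = 25 * 1024 * 1024;

/// Hypothetical helper: returns true if a file of [sizeInBytes] fits
/// within the assumed upload limit.
bool isWithinUploadLimit(int sizeInBytes) => sizeInBytes <= maxUploadBytes;

/// Hypothetical helper: reads the size of the file at [filePath] and
/// checks it against the assumed limit.
Future<bool> canUpload(String filePath) async {
  final size = await File(filePath).length();
  return isWithinUploadLimit(size);
}
```

An app could call `canUpload(path)` before sending the file and, if it returns false, ask the user to record a shorter memo or split the audio into smaller parts.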