PolySoundex 1.0.1
dotnet add package PolySoundex --version 1.0.1
NuGet\Install-Package PolySoundex -Version 1.0.1
<PackageReference Include="PolySoundex" Version="1.0.1" />
paket add PolySoundex --version 1.0.1
#r "nuget: PolySoundex, 1.0.1"
// Install PolySoundex as a Cake Addin #addin nuget:?package=PolySoundex&version=1.0.1 // Install PolySoundex as a Cake Tool #tool nuget:?package=PolySoundex&version=1.0.1
PolySoundex Library - Developer Documentation
Overview
PolySoundex is a multi-language Soundex implementation that allows developers to configure Soundex mappings for multiple languages and use them for phonetic search and indexing. The library supports various alphabets, including English and Nepali (Devanagari script), and can be extended to support additional languages.
This documentation provides a guide for configuring and using PolySoundex in your .NET applications, specifically with ASP.NET Core dependency injection.
Installation
Add the PolySoundex Library to your .NET project (assume it's available as a NuGet package).
Configure Dependency Injection in your application using the
AddPolySoundex
method, where you define language-specific Soundex rules and regex patterns for identifying different languages.
Configuration Example
Here is an example of how you can configure PolySoundex for both English and Nepali languages using the AddPolySoundex
extension method.
Dependency Injection Configuration
In your Program.cs
(for ASP.NET Core applications):
builder.Services.AddPolySoundex(config =>
{
// English Soundex Configuration
config.Add(new PolySoundexConfig()
{
Requirements = new Dictionary<int, char[]>()
{
{ 1, new[] { 'B', 'F', 'P', 'V' } },
{ 2, new[] { 'C', 'G', 'J', 'K', 'Q', 'S', 'X', 'Z' } },
{ 3, new[] { 'D', 'T' } },
{ 4, new[] { 'L' } },
{ 5, new[] { 'M', 'N' } },
{ 6, new[] { 'R' } }
},
LanguageIdentifierRegex = new Regex("^[A-Za-z]+$") // Regex for English alphabet
});
// Nepali Soundex (Devanagari script) Configuration
config.Add(new PolySoundexConfig
{
Requirements = new Dictionary<int, char[]>()
{
{ 1, new[] { 'क', 'ख', 'ग', 'घ' } },
{ 2, new[] { 'च', 'छ', 'ज', 'झ', 'ट', 'ठ', 'ड', 'ढ' } },
{ 3, new[] { 'त', 'थ', 'द', 'ध' } },
{ 4, new[] { 'न' } },
{ 5, new[] { 'प', 'फ', 'ब', 'भ', 'म' } },
{ 6, new[] { 'य', 'र', 'ल', 'व' } }
},
LanguageIdentifierRegex = new Regex("^[\u0900-\u097F]+$") // Regex for Devanagari (Nepali) script
});
});
Explanation:
- Requirements: A dictionary mapping integers (Soundex codes) to arrays of characters that share the same phonetic value. For example, in English, 'B', 'F', 'P', and 'V' are treated similarly by Soundex.
- LanguageIdentifierRegex: A regular expression used to identify the input language. In this case:
^[A-Za-z]+$
matches English characters.^[\u0900-\u097F]+$
matches Nepali (Devanagari script) characters.
Example
You can now use the PolySoundex
service in your app to generate Soundex values. Below is an example of how to create an API endpoint that takes user input and returns the Soundex value:
app.MapGet("/", (string input, IPolySoundex soundex) =>
{
try
{
var items = new List<(string name, string address)>()
{
new("Biraj", "123 Main St."),
new("Ram", "456 Elm St."),
new("Hari", "789 Oak St.")
};
var inputSoundex = soundex.GenerateSoundex(input); // assume input is "Braj" the soundex code is "B620"
var matchingItems = items
.Select(i => new { i.name, i.address, Soundex = soundex.GenerateSoundex(i.name.ToUpper()) })
.Where(i => i.Soundex == inputSoundex)
.ToList(); // returns { name = "Biraj", address = "123 Main St.", Soundex = "B620" }
return Results.Ok(matchingItems);
}
catch (Exception e)
{
return Results.BadRequest(e.Message);
}
});
What This Does:
- Accepts a
GET
request with astring
input. - Uses the
PolySoundex
service (injected viaIPolySoundex
) to generate the Soundex code for the given input. - Returns the Soundex result or an error if the language is not supported.
Example Usage:
English Example:
- Input:
Biraj
- Soundex Result:
B620
- Input:
Nepali Example:
- Input:
बिराज
(Biraj in Devanagari) - Soundex Result:
ब620
- Input:
How to Extend PolySoundex for Other Languages
To add support for other languages:
- Define the Soundex rules for the language in the
Requirements
dictionary. - Create an appropriate regular expression for identifying the alphabet or script of the language.
- Add the new
PolySoundexConfig
in theAddPolySoundex
configuration.
Example for another language:
config.Add(new PolySoundexConfig()
{
Requirements = new Dictionary<int, char[]>()
{
{ 1, new[] { 'А', 'О', 'У', 'Э', 'Ю', 'Я' } }, // Example Cyrillic characters
// Add other mappings for the Cyrillic alphabet
},
LanguageIdentifierRegex = new Regex("^[\u0400-\u04FF]+$") // Cyrillic script
});
Error Handling
- Unsupported Language: If the input does not match any configured languages, the
PolySoundexException
will be thrown with the messageLanguage not supported.
- Invalid Input: If the input string does not contain valid characters after being cleaned, an
ArgumentException
will be thrown with a message likeInput string must contain at least one valid character after cleaning.
Edge Cases
- Empty Input: The Soundex function will return the input as-is.
- Digits Only Input: Soundex will return the input unchanged (useful for phone numbers).
- Input with Mixed Languages: The
LanguageIdentifierRegex
ensures that input is matched against the correct language. If the input doesn’t fully conform to one language, an error will be thrown.
Conclusion
PolySoundex offers a flexible and extendable approach to handling multi-language phonetic search and Soundex generation. By configuring language-specific rules and utilizing regex for language identification, you can adapt it to various languages like English, Nepali, and more.
This makes it a powerful tool for building search functionality in applications where phonetic matching of names or other text is crucial, particularly in multi-lingual environments like legal, governmental, or educational systems.
Product | Versions Compatible and additional computed target framework versions. |
---|---|
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
.NET Core | netcoreapp2.1 is compatible. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
-
.NETCoreApp 2.1
- Microsoft.Extensions.DependencyInjection (>= 2.1.0)
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories
This package is not used by any popular GitHub repositories.