
Keyword grouping is a critical step in setting up successful Google Ads campaigns. It involves researching and organizing keywords into coherent clusters that align with specific ad groups. The quality of keyword grouping can significantly impact your ad rank, quality score, and overall campaign performance. However, when dealing with large datasets—sometimes containing tens of thousands of keywords—manual grouping becomes tedious, time-consuming, and prone to errors. This is where machine learning (ML) can offer a powerful and efficient solution.
The challenge of manual keyword grouping
When managing Google Ads, especially in competitive industries or large-scale campaigns, keyword research often yields an overwhelming amount of data. While manual grouping is feasible for smaller datasets, handling hundreds or thousands of keywords manually can lead to inefficiencies and mistakes. Incorrectly grouped keywords may result in lower ad relevance, poorer quality scores, and ultimately, decreased return on ad spend (ROAS).
Introducing machine learning to keyword grouping
Machine learning offers a sophisticated method for automating keyword grouping. By leveraging natural language processing (NLP) and advanced matching algorithms, ML can automatically analyze and group keywords based on their semantic relevance. One effective method is 'fuzzy matching,' which uses a technique called Levenshtein distance to measure similarity between keywords.
What is fuzzy matching?
Fuzzy matching is a method of finding strings that are approximately equal. Unlike exact matching, which only pairs identical strings, fuzzy matching allows for minor differences, making it highly effective for grouping similar keywords. The Levenshtein distance, at its core, calculates how many single-character edits (insertions, deletions, or substitutions) are needed to transform one string into another. For example, transforming 'kitten' into 'sitting' requires three edits, resulting in a Levenshtein distance of three.
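To make the metric concrete, here is a minimal pure-Python sketch of the standard dynamic-programming computation (not the tool's own code):

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion from a
                current[j - 1] + 1,            # insertion into a
                previous[j - 1] + (ca != cb),  # substitution (free if chars match)
            ))
        previous = current
    return previous[-1]

print(levenshtein("kitten", "sitting"))  # 3
```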
Implementing fuzzy matching in Google Sheets
To make this advanced method accessible, I've developed a Google Sheet tool that performs keyword grouping using fuzzy matching. The tool requires two inputs: 'actual keywords' and 'seed keywords.'
- Actual keywords: These are the keywords collected from your research, such as those sourced from Google Keyword Planner or SEMrush.
- Seed keywords: These represent the primary topic clusters or ad groups you want to assign your keywords to.
Once you input these lists, the tool automatically matches each keyword with the most relevant seed keyword, providing a match score that indicates how strong the connection is.
Understanding the match score
The match score is calculated using the formula:
Match score = (max string length - Levenshtein distance) / max string length
A higher score indicates a closer match, helping you identify which keywords fit best within each ad group. If the match score is low, you might need to refine your seed keywords or adjust your topic clusters to improve the results.
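As a sketch, the formula translates directly into Python; here the rapidfuzz library supplies the Levenshtein distance, standing in for whatever implementation the Sheet tool uses internally:

```python
from rapidfuzz.distance import Levenshtein

def match_score(keyword: str, seed: str) -> float:
    """(max string length - Levenshtein distance) / max string length."""
    max_len = max(len(keyword), len(seed))
    if max_len == 0:
        return 1.0  # two empty strings are a perfect match
    return (max_len - Levenshtein.distance(keyword, seed)) / max_len

print(round(match_score("cheap running shoes", "running shoes"), 2))  # 0.68
```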
Beyond fuzzy matching: other machine learning approaches
While fuzzy matching is a robust technique, other methods can further enhance keyword grouping:
- Jaro-Winkler distance: Useful for short strings, as it prioritizes common prefixes.
- Cosine similarity: Measures similarity by comparing the angle between two vectors, often used for word embeddings.
- Token-based similarity: Breaks down strings into tokens (words) and compares sets of tokens using methods like TF-IDF (term frequency-inverse document frequency).
- Semantic similarity: Utilizes advanced models like Word2Vec or BERT to understand the contextual meaning of keywords.
Advantages of using machine learning for keyword grouping
- Speed: Machine learning can process thousands of keywords in seconds, saving valuable time.
- Accuracy: Reduces human error by automating the grouping process based on objective algorithms.
- Scalability: Easily handles large datasets that would be impractical to manage manually.
Potential limitations
While the Google Sheet tool offers an accessible solution, it does have some limitations. The approximation of ML algorithms within Google Sheets cannot match the full potential of running these models in a dedicated programming environment like Python. However, for most marketers and advertisers, this tool strikes a balance between simplicity and functionality.
Machine learning, particularly through fuzzy matching, offers a practical and efficient solution for keyword grouping in Google Ads campaigns. While there are more advanced methods available, this Google Sheet tool provides a valuable starting point, especially for those without programming expertise. By automating keyword grouping, marketers can enhance their ad relevance, improve quality scores, and ultimately drive better performance from their Google Ads campaigns.
If you're interested in a more advanced, standalone application or custom coding solutions for keyword grouping, feel free to reach out. As machine learning technology continues to evolve, so will the tools and techniques available to optimize your advertising strategies.
How to Use:
1. Open the Google Sheet:
   - Create a new Google Sheet or open an existing one.
2. Prepare Your Input Data:
   - In a sheet named "Sheet1," create two columns:
     - Seed Keywords: List of seed keywords.
     - Actual Keywords: List of keywords to match against the seed keywords.
3. Add the Script:
   - Go to Extensions → Apps Script in your Google Sheet.
   - Replace any existing code with the provided script.
4. Save and Authorize the Script:
   - Save the script (e.g., name it FuzzyMatch).
   - Click the play ▶️ button or go to Run → fuzzyMatchKeywords.
   - Grant necessary permissions for the script to access your Google Sheets.
5. Run the Script:
   - After running the script, the fuzzy matching results will be output in a new sheet named "Fuzzy Match Output".
   - The output will contain three columns:
     - Match Keyword: The keyword from the "Actual Keywords" column.
     - Best Seed Keyword: The closest match from the "Seed Keywords" column.
     - Match Score: The similarity score (percentage).
6. Check the Output:
   - The results will be available in the "Fuzzy Match Output" tab of the same spreadsheet.
The script I provided uses Levenshtein distance (edit distance) as the underlying algorithm for calculating similarity.
Match Score Calculation:
- The script calculates the similarity as (max string length - Levenshtein distance) / max string length, reported as a percentage.
Why It’s Fuzzy Matching:
- Levenshtein Distance-Based: The Apps Script implementation uses Levenshtein distance, a foundational algorithm for fuzzy matching.
- Score Calculation: The script calculates similarity scores based on the number of edits required to make two strings identical, which aligns with the concept of "fuzziness" in string matching.
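For readers who would rather run the same logic outside Google Sheets, here is a Python sketch that mirrors the behavior described above: score every actual keyword against every seed keyword, keep the best seed, and sort by score. The keyword lists are invented for illustration, and rapidfuzz stands in for the Apps Script's Levenshtein routine.

```python
from rapidfuzz.distance import Levenshtein

seed_keywords = ["running shoes", "hiking boots"]   # example ad-group seeds
actual_keywords = [                                  # example research output
    "cheap running shoes", "trail running shoe",
    "waterproof hiking boots", "mens hiking boot sale",
]

def match_score(keyword: str, seed: str) -> float:
    max_len = max(len(keyword), len(seed))
    return (max_len - Levenshtein.distance(keyword, seed)) / max_len

rows = []
for kw in actual_keywords:
    # Pick the seed keyword with the highest match score for this keyword.
    best_seed = max(seed_keywords, key=lambda s: match_score(kw, s))
    rows.append((kw, best_seed, round(100 * match_score(kw, best_seed), 1)))

# Mirror the "Fuzzy Match Output" columns: Match Keyword, Best Seed Keyword, Match Score (%).
for row in sorted(rows, key=lambda r: r[2], reverse=True):
    print(row)
```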
1. String Similarity Approaches
These methods compare the text of the strings directly:
a. Levenshtein Distance (Edit Distance)
- Measures the minimum number of single-character edits (insertions, deletions, substitutions) needed to transform one string into another.
- Common Libraries: fuzzywuzzy, rapidfuzz, or custom implementations.
- Strength: Good for simple spelling differences or typos.
- Weakness: Misses terms that are related in meaning but spelled very differently.
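For instance, rapidfuzz (a maintained, faster drop-in for fuzzywuzzy) exposes an edit-based similarity as a 0-100 ratio:

```python
from rapidfuzz import fuzz

# fuzz.ratio returns a 0-100 similarity based on edit operations.
print(fuzz.ratio("running shoes", "runing shoes"))    # high: one missing character
print(fuzz.ratio("running shoes", "export finance"))  # low: unrelated strings
```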
b. Jaro-Winkler Distance
- Focuses on matching characters and their positions but gives higher weight to the beginning of strings.
- Suitable for shorter strings or when prefixes are important.
- Common Libraries: jellyfish (Python), string-similarity (JavaScript).
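A quick illustration with jellyfish; note that older jellyfish releases name this function jaro_winkler rather than jaro_winkler_similarity:

```python
import jellyfish

# Jaro-Winkler rewards strings that share a common prefix.
print(jellyfish.jaro_winkler_similarity("marketing", "marketting"))  # near 1.0
print(jellyfish.jaro_winkler_similarity("marketing", "markets"))     # lower
```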
c. Cosine Similarity (Character-Based)
- Converts strings into character n-grams and compares them as vectors.
- Measures the cosine of the angle between the vectors.
- Strength: Works well when word order is less critical.
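A sketch with scikit-learn; the bigram/trigram range is an arbitrary choice for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

keywords = ["running shoes", "shoes for running", "hiking boots"]

# Represent each string as counts of character 2- and 3-grams.
vectorizer = CountVectorizer(analyzer="char_wb", ngram_range=(2, 3))
vectors = vectorizer.fit_transform(keywords)

# Pairwise cosine similarity between the n-gram vectors; note the first two
# strings score high despite the different word order.
print(cosine_similarity(vectors).round(2))
```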
2. Token-Based Similarity Approaches
These methods split strings into words (tokens) and analyze their similarity:
a. Jaccard Similarity
- Compares the overlap of tokens between two sets: J(A, B) = |A ∩ B| / |A ∪ B|.
- Works well for phrases with shared words.
- Common Libraries: sklearn.metrics.jaccard_score.
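In plain Python, token-level Jaccard similarity needs nothing beyond sets:

```python
def jaccard(a: str, b: str) -> float:
    """|A ∩ B| / |A ∪ B| over the sets of whitespace-separated tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

print(jaccard("export finance loans", "export finance"))  # 2 shared / 3 total ≈ 0.67
```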
b. Cosine Similarity (Word-Based)
- Treats each string as a vector of word frequencies or TF-IDF weights.
- Measures similarity between word frequency vectors.
- Strength: Works well for longer sentences or phrases.
c. TF-IDF Vectorization
- Converts text into a numerical representation using Term Frequency-Inverse Document Frequency (TF-IDF).
- Measures how unique or significant words are in the context of the dataset.
- Libraries: scikit-learn (Python).
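A combined sketch of word-based cosine similarity with TF-IDF weighting (covering both b and c above) using scikit-learn; the keyword lists are placeholders:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

seed_keywords = ["running shoes", "hiking boots"]
actual_keywords = ["best running shoes for women", "lightweight hiking boots"]

# Fit TF-IDF on all strings so both lists share a single vocabulary.
vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(seed_keywords + actual_keywords)

seeds = matrix[: len(seed_keywords)]
actuals = matrix[len(seed_keywords):]

# Rows: actual keywords; columns: seed keywords.
print(cosine_similarity(actuals, seeds).round(2))
```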
d. Token Set and Token Sort Ratios
- Used by fuzzywuzzy:
  - Token Set Ratio: Ignores duplicate words and matches unique sets of tokens.
  - Token Sort Ratio: Sorts tokens alphabetically before comparing.
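Both ratios are available in fuzzywuzzy and, under the same names, in rapidfuzz:

```python
from rapidfuzz import fuzz  # fuzzywuzzy exposes the same functions

a, b = "shoes running cheap", "cheap running shoes"

print(fuzz.token_sort_ratio(a, b))  # 100: identical once tokens are sorted
print(fuzz.token_set_ratio("running shoes", "running shoes running shoes"))  # duplicates ignored
```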
3. Semantic Similarity Approaches
These methods go beyond string matching to capture the meaning of the words:
a. Word Embeddings (Word2Vec, GloVe, FastText)
- Represent words as dense numerical vectors based on their meanings.
- Measure similarity between embeddings using cosine similarity.
- Strength: Captures synonyms and semantic relationships.
- Weakness: Requires pre-trained models.
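As an illustration with gensim's downloadable GloVe vectors; the model name is one of gensim's stock datasets, and the first call downloads the vectors:

```python
import gensim.downloader as api

# Load pre-trained 50-dimensional GloVe word vectors.
wv = api.load("glove-wiki-gigaword-50")

# Cosine similarity between word vectors captures semantic closeness.
print(wv.similarity("shoes", "boots"))    # related footwear terms score high
print(wv.similarity("shoes", "finance"))  # unrelated terms score lower
```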
b. Sentence Embeddings (SBERT, Universal Sentence Encoder)
- Converts entire phrases or sentences into embeddings.
- Suitable for longer, complex text where word order and context matter.
- Libraries: sentence-transformers (SBERT), tensorflow_hub (USE).
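A sketch with sentence-transformers; the model name all-MiniLM-L6-v2 is a common default, chosen here purely for illustration:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

seeds = ["running shoes", "hiking boots"]
keywords = ["best sneakers for jogging"]  # no word overlap with either seed

# Encode both lists and compare embeddings with cosine similarity.
seed_emb = model.encode(seeds, convert_to_tensor=True)
kw_emb = model.encode(keywords, convert_to_tensor=True)

print(util.cos_sim(kw_emb, seed_emb))  # should favor "running shoes" despite zero shared words
```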
c. BERT or Transformer-Based Models
- Fine-tuned models (e.g., BERT, RoBERTa) can compare semantic similarity by contextualizing words in a sentence.
- Common Libraries: HuggingFace Transformers.
4. Rule-Based Matching
a. Exact Match
- Directly matches keywords using string equality or a dictionary lookup.
- Strength: Simple and fast for structured data.
- Weakness: Limited to exact matches.
b. Partial Rules
- Use regular expressions or pattern matching to identify specific keywords or structures.
- Example: Matching "finance" in "export finance" using regex.
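Both rule types (exact match above and partial rules) take only a few lines of Python; the ad-group table is illustrative:

```python
import re

ad_groups = {"running shoes": "Shoes", "hiking boots": "Boots"}  # exact-match lookup table

keyword = "export finance"

# Exact match: dictionary lookup (fast, but misses near-variants).
print(ad_groups.get(keyword, "no exact match"))

# Partial rule: regex word-boundary match, as in the "finance" example above.
if re.search(r"\bfinance\b", keyword):
    print("matched the 'finance' rule")
```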
5. Hybrid Approaches
Combine multiple methods to balance efficiency and accuracy:
- Example: First, use Jaccard similarity to filter likely matches and then apply a semantic similarity model like BERT for ranking.
- Strength: Balances computational cost and accuracy.
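A hedged sketch of that pipeline, with a cheap Jaccard filter followed by a sentence-transformers re-rank; the model name and the 0.2 threshold are illustrative assumptions:

```python
from sentence_transformers import SentenceTransformer, util

def jaccard(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

seeds = ["running shoes", "hiking boots", "trail running gear"]
keyword = "shoes for trail running"

# Stage 1: cheap Jaccard filter keeps only plausible candidates.
candidates = [s for s in seeds if jaccard(keyword, s) > 0.2]

# Stage 2: rank the surviving candidates with a semantic model.
model = SentenceTransformer("all-MiniLM-L6-v2")
kw_emb = model.encode(keyword, convert_to_tensor=True)
cand_emb = model.encode(candidates, convert_to_tensor=True)
scores = util.cos_sim(kw_emb, cand_emb)[0]

ranked = sorted(zip(candidates, scores.tolist()), key=lambda p: p[1], reverse=True)
print(ranked)
```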
6. Phonetic Matching
a. Soundex
- Encodes words into phonetic codes to match similar-sounding words.
- Useful for matching names or terms with spelling variations.
b. Metaphone/Double Metaphone
- More advanced phonetic algorithms for sound-based matches.
- Common Libraries: phonetics (Python).
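An illustration with jellyfish, which ships both algorithms (the phonetics package mentioned above offers similar functions):

```python
import jellyfish

# Soundex: identical codes mean the words "sound alike".
print(jellyfish.soundex("nike"), jellyfish.soundex("nyke"))

# Metaphone: a finer-grained phonetic encoding.
print(jellyfish.metaphone("nike"), jellyfish.metaphone("nyke"))
```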
7. Custom Scoring
- Combine multiple similarity scores (e.g., Levenshtein distance, TF-IDF, embeddings) into a weighted formula.
- Example: final score = 0.5 × Levenshtein score + 0.3 × TF-IDF cosine + 0.2 × embedding similarity (the weights are illustrative).
- Strength: Allows fine-tuning for specific use cases.
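A minimal sketch of such a blend, mixing an edit-distance ratio with token overlap; the weights are arbitrary and would be tuned per use case:

```python
from rapidfuzz import fuzz

def jaccard(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def combined_score(a: str, b: str, w_edit: float = 0.6, w_token: float = 0.4) -> float:
    """Weighted blend of an edit-distance ratio (scaled to 0-1) and token overlap."""
    return w_edit * (fuzz.ratio(a, b) / 100) + w_token * jaccard(a, b)

print(round(combined_score("cheap running shoes", "running shoes"), 2))
```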
Ranking by Accuracy:
After computing similarity scores using one or more of the methods above, the results can be ranked in descending order of the score. For example:
1. Compute similarity scores between each "actual keyword" and all "seed keywords."
2. Select the highest-scoring seed keyword for each actual keyword.
3. Sort the results by score to rank the matches.
Choosing the Best Method:
- Short Keywords with Typos: Use Levenshtein or Jaro-Winkler.
- Phrases with Overlapping Words: Use Jaccard or Cosine Similarity.
- Semantically Related Keywords: Use embeddings (Word2Vec, BERT).
- Structured Data (e.g., product names): Use rule-based or token-based methods.