.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/precision_selection_example.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_precision_selection_example.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_precision_selection_example.py:


Precision Selection Strategies - Complete Guide
===============================================

This example demonstrates all 5 intelligent precision selection strategies
in M3S v0.6.0+, helping you choose the optimal precision level for any use case.

.. GENERATED FROM PYTHON SOURCE LINES 8-353

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ================================================================================
    M3S Intelligent Precision Selection - All 5 Strategies
    ================================================================================

    ================================================================================
    Strategy 1: Area-Based Selection
    ================================================================================

    Use when: You know the desired cell size in km²
    Examples: 'I need cells around 10 km²', 'Show me 100 hectare cells'

    Finding precision for various target areas:
    --------------------------------------------------------------------------------
    Target:   1000.0 km² → Precision:  5 (Actual:  252.903 km², Deviation:  74.7%, Confidence: 0%)
    Target:    100.0 km² → Precision:  6 (Actual:   36.129 km², Deviation:  63.9%, Confidence: 0%)
    Target:     10.0 km² → Precision:  7 (Actual:    5.161 km², Deviation:  48.4%, Confidence: 0%)
    Target:      1.0 km² → Precision:  8 (Actual:    0.737 km², Deviation:  26.3%, Confidence: 12%)
    Target:      0.1 km² → Precision:  9 (Actual:    0.105 km², Deviation:   5.0%, Confidence: 83%)

    ================================================================================
    Strategy 2: Count-Based Selection
    ================================================================================

    Use when: You want a specific number of cells in a region
    Examples: 'Split this city into ~100 cells', 'I want about 1000 cells here'

    Finding precision for Manhattan area with different target counts:
    --------------------------------------------------------------------------------
    Target:   10 cells → Precision:  6 (Actual: ~   5 cells, Deviation:  50.0%, Confidence: 0%)
    Target:   50 cells → Precision:  7 (Actual: ~  40 cells, Deviation:  20.0%, Confidence: 50%)
    Target:  100 cells → Precision:  8 (Actual: ~ 286 cells, Deviation: 186.0%, Confidence: 0%)
    Target:  500 cells → Precision:  8 (Actual: ~ 286 cells, Deviation:  42.8%, Confidence: 0%)

    ================================================================================
    Strategy 3: Use-Case Based Selection (Curated Presets)
    ================================================================================

    Use when: You have a common spatial analysis use case
    Examples: Analyzing neighborhoods, city planning, country-level analysis

    H3 precision recommendations for common use cases:
    --------------------------------------------------------------------------------
    global          → Precision:  0 (Avg area:  4357449.416 km², Confidence: 95%)
    continental     → Precision:  2 (Avg area:    86801.780 km², Confidence: 95%)
    country         → Precision:  3 (Avg area:    12392.264 km², Confidence: 95%)
    region          → Precision:  5 (Avg area:      252.903 km², Confidence: 95%)
    city            → Precision:  7 (Avg area:        5.161 km², Confidence: 95%)
    neighborhood    → Precision:  9 (Avg area:        0.105 km², Confidence: 95%)
    street          → Precision: 11 (Avg area:        0.002 km², Confidence: 95%)
    building        → Precision: 13 (Avg area:        0.000 km², Confidence: 95%)

    Same use case ('city') across different grid systems:
    --------------------------------------------------------------------------------
     system  precision  area_km2  confidence
    geohash          5     2.443        0.95
         h3          7     5.161        0.95
         s2         16     0.020        0.95
    quadkey         13     7.600        0.95

    ================================================================================
    Strategy 4: Distance-Based Selection
    ================================================================================

    Use when: You care about cell edge length rather than area
    Examples: 'Cells with ~100m edges', 'I need 1km grid spacing'

    Finding precision for various target edge lengths:
    --------------------------------------------------------------------------------
    Target:  10000 m → Precision:  5 (Actual: ~9866.4 m, Deviation:   1.3%, Confidence: 96%)
    Target:   5000 m → Precision:  6 (Actual: ~3729.1 m, Deviation:  25.4%, Confidence: 15%)
    Target:   1000 m → Precision:  8 (Actual: ~ 532.6 m, Deviation:  46.7%, Confidence: 0%)
    Target:    500 m → Precision:  8 (Actual: ~ 532.6 m, Deviation:   6.5%, Confidence: 78%)
    Target:    100 m → Precision: 10 (Actual: ~  76.0 m, Deviation:  24.0%, Confidence: 20%)
    Target:     50 m → Precision: 11 (Actual: ~  27.7 m, Deviation:  44.5%, Confidence: 0%)

    ================================================================================
    Strategy 5: Performance-Based Selection
    ================================================================================

    Use when: You need to balance precision vs computational cost
    Examples: Real-time applications, limited compute budget, large regions

    Performance-optimized precision for different scenarios:
    --------------------------------------------------------------------------------
    point_query     (budget:    10ms, region:   1000 km²)
     → Precision:  8, Est. cells:  1356, Est. time:   1.9 ms
    intersect       (budget:   100ms, region:    500 km²)
     → Precision:  8, Est. cells:   678, Est. time:  72.8 ms
    conversion      (budget:   200ms, region:    100 km²)
     → Precision:  8, Est. cells:   135, Est. time:  77.5 ms

    ================================================================================
    Practical Example: Combining Strategies in Real Workflow
    ================================================================================

    Scenario: Analyzing neighborhoods in San Francisco
    --------------------------------------------------------------------------------

    1. Use-case based approach:
     Precision: 9, Confidence: 95%
     H3 precision 9 optimized for 'neighborhood' use case (avg cell area: 0.10 km²)

    2. Area-based approach (target 0.5 km² cells):
     Precision: 8, Confidence: 0%
     Actual area: 0.737 km²

    3. Count-based approach (target 200 cells):
     Precision: 8, Confidence: 0%
     Estimated cells: 271

    4. Distance-based approach (target 500m edges):
     Precision: 8, Confidence: 78%
     Actual edge length: 532.6 m

    Comparing all recommendations:
    --------------------------------------------------------------------------------
          Strategy  Precision  Confidence  Area (km²)
          Use-case          9         95%       0.105
        Area-based          8          0%       0.737
       Count-based          8          0%       0.737
    Distance-based          8         78%      ~0.284

    5. Using the recommendation in a query:
    --------------------------------------------------------------------------------

    Executed query with precision 9
    Found 10 cells (limited to 10 for display)
    Average cell area: 0.109 km²

    Sample cells:
     89283080c83ffff - 0.109 km²
     89283080c93ffff - 0.109 km²
     89283080c97ffff - 0.109 km²
     89283082127ffff - 0.109 km²
     8928308212fffff - 0.109 km²

    ================================================================================
    Summary: Choosing the Right Strategy
    ================================================================================

    1. Use-Case Based (Strategy 3):
       → Best for: Standard spatial analysis tasks
       → Pros: High confidence, battle-tested presets
       → Cons: Less control over exact cell size

    2. Area-Based (Strategy 1):
       → Best for: When cell size matters (e.g., land parcels, service areas)
       → Pros: Precise control over cell area
       → Cons: May not account for cell count in region

    3. Count-Based (Strategy 2):
       → Best for: When you need specific number of divisions
       → Pros: Predictable cell count for budgeting/planning
       → Cons: Cell sizes may vary across region

    4. Distance-Based (Strategy 4):
       → Best for: Grid-like applications, routing, proximity analysis
       → Pros: Intuitive edge length specification
       → Cons: Approximation for non-square cells

    5. Performance-Based (Strategy 5):
       → Best for: Real-time apps, constrained compute environments
       → Pros: Balances detail vs speed
       → Cons: May sacrifice precision for performance

    General guidance:
      - Start with use-case presets (Strategy 3) for common tasks
      - Use area/distance (1/4) when you have specific size requirements
      - Use count-based (2) for bounded cell count needs
      - Use performance-based (5) when speed is critical



|

.. code-block:: Python

    import pandas as pd

    from m3s import GridBuilder, PrecisionSelector

    pd.set_option("display.max_columns", None)
    pd.set_option("display.width", 120)

    print("=" * 80)
    print("M3S Intelligent Precision Selection - All 5 Strategies")
    print("=" * 80)
    print()

    # Initialize selector for H3 grid system
    selector = PrecisionSelector("h3")

    # ============================================================================
    # Strategy 1: Area-Based Selection
    # ============================================================================
    print("=" * 80)
    print("Strategy 1: Area-Based Selection")
    print("=" * 80)
    print("\nUse when: You know the desired cell size in km²")
    print("Examples: 'I need cells around 10 km²', 'Show me 100 hectare cells'")
    print()

    # Find precision for different target areas
    target_areas = [1000.0, 100.0, 10.0, 1.0, 0.1]

    print("Finding precision for various target areas:")
    print("-" * 80)
    for target_area in target_areas:
        rec = selector.for_area(target_area_km2=target_area, tolerance=0.3)
        deviation = (
            abs(rec.actual_area_km2 - target_area) / target_area * 100
            if target_area > 0
            else 0
        )
        print(
            f"Target: {target_area:8.1f} km² → Precision: {rec.precision:2d} "
            f"(Actual: {rec.actual_area_km2:8.3f} km², Deviation: {deviation:5.1f}%, "
            f"Confidence: {rec.confidence:.0%})"
        )
    print()

    # ============================================================================
    # Strategy 2: Count-Based Selection
    # ============================================================================
    print("=" * 80)
    print("Strategy 2: Count-Based Selection")
    print("=" * 80)
    print("\nUse when: You want a specific number of cells in a region")
    print("Examples: 'Split this city into ~100 cells', 'I want about 1000 cells here'")
    print()

    # Manhattan bounding box
    manhattan_bounds = (40.70, -74.05, 40.85, -73.90)
    target_counts = [10, 50, 100, 500]

    print("Finding precision for Manhattan area with different target counts:")
    print("-" * 80)
    for target_count in target_counts:
        rec = selector.for_region_count(
            bounds=manhattan_bounds, target_count=target_count, tolerance=0.4
        )
        deviation = (
            abs(rec.actual_cell_count - target_count) / target_count * 100
            if target_count > 0
            else 0
        )
        print(
            f"Target: {target_count:4d} cells → Precision: {rec.precision:2d} "
            f"(Actual: ~{rec.actual_cell_count:4d} cells, Deviation: {deviation:5.1f}%, "
            f"Confidence: {rec.confidence:.0%})"
        )
    print()

    # ============================================================================
    # Strategy 3: Use-Case Based Selection
    # ============================================================================
    print("=" * 80)
    print("Strategy 3: Use-Case Based Selection (Curated Presets)")
    print("=" * 80)
    print("\nUse when: You have a common spatial analysis use case")
    print("Examples: Analyzing neighborhoods, city planning, country-level analysis")
    print()

    use_cases = [
        "global",
        "continental",
        "country",
        "region",
        "city",
        "neighborhood",
        "street",
        "building",
    ]

    print("H3 precision recommendations for common use cases:")
    print("-" * 80)
    for use_case in use_cases:
        rec = selector.for_use_case(use_case)
        print(
            f"{use_case:15s} → Precision: {rec.precision:2d} "
            f"(Avg area: {rec.actual_area_km2:12.3f} km², Confidence: {rec.confidence:.0%})"
        )
    print()

    # Compare across grid systems
    print("Same use case ('city') across different grid systems:")
    print("-" * 80)
    systems = ["geohash", "h3", "s2", "quadkey"]
    city_recs = []
    for system in systems:
        sel = PrecisionSelector(system)
        rec = sel.for_use_case("city")
        city_recs.append(
            {
                "system": system,
                "precision": rec.precision,
                "area_km2": rec.actual_area_km2,
                "confidence": rec.confidence,
            }
        )

    df = pd.DataFrame(city_recs)
    print(df.to_string(index=False))
    print()

    # ============================================================================
    # Strategy 4: Distance-Based Selection
    # ============================================================================
    print("=" * 80)
    print("Strategy 4: Distance-Based Selection")
    print("=" * 80)
    print("\nUse when: You care about cell edge length rather than area")
    print("Examples: 'Cells with ~100m edges', 'I need 1km grid spacing'")
    print()

    target_distances = [10000, 5000, 1000, 500, 100, 50]  # meters

    print("Finding precision for various target edge lengths:")
    print("-" * 80)
    for target_dist in target_distances:
        rec = selector.for_distance(edge_length_m=target_dist, tolerance=0.3)
        deviation = (
            abs(rec.edge_length_m - target_dist) / target_dist * 100
            if target_dist > 0
            else 0
        )
        print(
            f"Target: {target_dist:6d} m → Precision: {rec.precision:2d} "
            f"(Actual: ~{rec.edge_length_m:6.1f} m, Deviation: {deviation:5.1f}%, "
            f"Confidence: {rec.confidence:.0%})"
        )
    print()

    # ============================================================================
    # Strategy 5: Performance-Based Selection
    # ============================================================================
    print("=" * 80)
    print("Strategy 5: Performance-Based Selection")
    print("=" * 80)
    print("\nUse when: You need to balance precision vs computational cost")
    print("Examples: Real-time applications, limited compute budget, large regions")
    print()

    # Different operation types with time budgets
    scenarios = [
        ("point_query", 10.0, 1000.0),  # Fast operation, large region
        ("intersect", 100.0, 500.0),  # Medium operation, medium region
        ("conversion", 200.0, 100.0),  # Expensive operation, small region
    ]

    print("Performance-optimized precision for different scenarios:")
    print("-" * 80)
    for op_type, time_budget, region_size in scenarios:
        rec = selector.for_performance(
            operation_type=op_type, time_budget_ms=time_budget, region_size_km2=region_size
        )
        print(
            f"{op_type:15s} (budget: {time_budget:5.0f}ms, region: {region_size:6.0f} km²)"
        )
        print(
            f" → Precision: {rec.precision:2d}, "
            f"Est. cells: {rec.metadata['estimated_cells']:5d}, "
            f"Est. time: {rec.metadata['estimated_time_ms']:5.1f} ms"
        )
    print()

    # ============================================================================
    # Practical Example: Combining Strategies
    # ============================================================================
    print("=" * 80)
    print("Practical Example: Combining Strategies in Real Workflow")
    print("=" * 80)
    print()

    print("Scenario: Analyzing neighborhoods in San Francisco")
    print("-" * 80)

    # Try multiple strategies and compare
    sf_bounds = (37.70, -122.52, 37.82, -122.35)

    print("\n1. Use-case based approach:")
    rec1 = selector.for_use_case("neighborhood")
    print(f" Precision: {rec1.precision}, Confidence: {rec1.confidence:.0%}")
    print(f" {rec1.explanation}")

    print("\n2. Area-based approach (target 0.5 km² cells):")
    rec2 = selector.for_area(target_area_km2=0.5)
    print(f" Precision: {rec2.precision}, Confidence: {rec2.confidence:.0%}")
    print(f" Actual area: {rec2.actual_area_km2:.3f} km²")

    print("\n3. Count-based approach (target 200 cells):")
    rec3 = selector.for_region_count(bounds=sf_bounds, target_count=200)
    print(f" Precision: {rec3.precision}, Confidence: {rec3.confidence:.0%}")
    print(f" Estimated cells: {rec3.actual_cell_count}")

    print("\n4. Distance-based approach (target 500m edges):")
    rec4 = selector.for_distance(edge_length_m=500)
    print(f" Precision: {rec4.precision}, Confidence: {rec4.confidence:.0%}")
    print(f" Actual edge length: {rec4.edge_length_m:.1f} m")

    print("\nComparing all recommendations:")
    print("-" * 80)
    comparison = pd.DataFrame(
        [
            {
                "Strategy": "Use-case",
                "Precision": rec1.precision,
                "Confidence": f"{rec1.confidence:.0%}",
                "Area (km²)": f"{rec1.actual_area_km2:.3f}",
            },
            {
                "Strategy": "Area-based",
                "Precision": rec2.precision,
                "Confidence": f"{rec2.confidence:.0%}",
                "Area (km²)": f"{rec2.actual_area_km2:.3f}",
            },
            {
                "Strategy": "Count-based",
                "Precision": rec3.precision,
                "Confidence": f"{rec3.confidence:.0%}",
                "Area (km²)": f"{rec3.metadata.get('region_area_km2', 0) / rec3.actual_cell_count:.3f}",
            },
            {
                "Strategy": "Distance-based",
                "Precision": rec4.precision,
                "Confidence": f"{rec4.confidence:.0%}",
                "Area (km²)": "~" + f"{(rec4.edge_length_m/1000)**2:.3f}",
            },
        ]
    )
    print(comparison.to_string(index=False))

    print("\n5. Using the recommendation in a query:")
    print("-" * 80)

    # Use the highest-confidence recommendation
    best_rec = max([rec1, rec2, rec3, rec4], key=lambda r: r.confidence)

    result = (
        GridBuilder.for_system("h3")
        .with_auto_precision(best_rec)
        .in_bbox(37.75, -122.45, 37.80, -122.40)  # Small SF area
        .limit(10)
        .execute()
    )

    print(f"\nExecuted query with precision {best_rec.precision}")
    print(f"Found {len(result)} cells (limited to 10 for display)")
    print(
        f"Average cell area: {sum(c.area_km2 for c in result.many) / len(result):.3f} km²"
    )

    # Display cells
    print("\nSample cells:")
    for cell in result.many[:5]:
        print(f" {cell.identifier} - {cell.area_km2:.3f} km²")

    print()
    print("=" * 80)
    print("Summary: Choosing the Right Strategy")
    print("=" * 80)
    print(
        """
    1. Use-Case Based (Strategy 3):
       → Best for: Standard spatial analysis tasks
       → Pros: High confidence, battle-tested presets
       → Cons: Less control over exact cell size

    2. Area-Based (Strategy 1):
       → Best for: When cell size matters (e.g., land parcels, service areas)
       → Pros: Precise control over cell area
       → Cons: May not account for cell count in region

    3. Count-Based (Strategy 2):
       → Best for: When you need specific number of divisions
       → Pros: Predictable cell count for budgeting/planning
       → Cons: Cell sizes may vary across region

    4. Distance-Based (Strategy 4):
       → Best for: Grid-like applications, routing, proximity analysis
       → Pros: Intuitive edge length specification
       → Cons: Approximation for non-square cells

    5. Performance-Based (Strategy 5):
       → Best for: Real-time apps, constrained compute environments
       → Pros: Balances detail vs speed
       → Cons: May sacrifice precision for performance

    General guidance:
      - Start with use-case presets (Strategy 3) for common tasks
      - Use area/distance (1/4) when you have specific size requirements
      - Use count-based (2) for bounded cell count needs
      - Use performance-based (5) when speed is critical
    """
    )


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 22.953 seconds)


.. _sphx_glr_download_auto_examples_precision_selection_example.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: precision_selection_example.ipynb <precision_selection_example.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: precision_selection_example.py <precision_selection_example.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: precision_selection_example.zip <precision_selection_example.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
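The Strategy 1 output in this example reports a deviation computed as ``abs(actual - target) / target * 100``. As a rough, library-independent sketch of how an area-based choice could work, the snippet below picks the precision whose average cell area is closest to a target. It uses only the H3 average areas printed in the output of this example. The helper name ``closest_precision_by_area``, the lookup table, and the "smallest absolute difference" rule are assumptions made for illustration; the actual M3S selection logic and its confidence score may differ.

```python
# Average H3 cell areas (km²) per precision, copied from the example output.
# Only precisions 5-9 are listed, so this table is deliberately incomplete.
H3_AVG_AREA_KM2 = {
    5: 252.903,
    6: 36.129,
    7: 5.161,
    8: 0.737,
    9: 0.105,
}


def closest_precision_by_area(target_area_km2, areas=H3_AVG_AREA_KM2):
    """Return (precision, actual_area_km2, deviation_pct) for the nearest area.

    Hypothetical helper: chooses the precision that minimizes the absolute
    difference to the target area. This is an assumption, not M3S's own code.
    """
    if target_area_km2 <= 0:
        raise ValueError("target_area_km2 must be positive")
    precision = min(areas, key=lambda p: abs(areas[p] - target_area_km2))
    actual = areas[precision]
    # Same deviation formula as the example script uses for its printout.
    deviation_pct = abs(actual - target_area_km2) / target_area_km2 * 100
    return precision, actual, deviation_pct


for target in (10.0, 0.5, 0.1):
    precision, actual, deviation = closest_precision_by_area(target)
    print(
        f"target={target} km² -> precision {precision} "
        f"({actual} km², {deviation:.1f}% off)"
    )
```

For the targets shown in the example (10 km², 0.5 km², 0.1 km²) this simple rule lands on the same precisions (7, 8, 9) as the recorded output, which suggests, but does not prove, that the library behaves similarly over this range.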