Csmart Methodology for Weight Estimation

The sample weight is an important variable in coffee grading. Most grading methods fix a total weight per sample and score it with a point system based on defect occurrences.

Csmart Digit is a computer vision-based solution that does not rely on scales or weight measurements. Instead, it uses indirect inference techniques to estimate the sample weight.

Additionally, ensuring that every seed in the hopper is registered would require a fixed, tightly specified computer to run the analysis, which would complicate use of the software on different computer and laptop configurations. To address this, the Csmart methodology estimates the weight from a count-per-screen-size table, that is, the expected seed count per screen size for a fixed weight. This approach makes the software more flexible and accessible across devices, because it removes the need to record every single seed.

To estimate the weight of a sample in the Csmart Digit methodology, the following mathematical approach is used:

Define Screen Size Categories

Let S = {s1, s2, ..., sn} be the set of screen size categories. This set is chosen freely, based on commonly used sieves, and is defined by the user when creating the AI model. An example of screen size categories might be S = {12, 13, 14, 15, 16, 17, 18, 19}. Seeds whose screen size falls below the minimum or above the maximum are grouped into the smallest or largest category, respectively.
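The boundary rule can be sketched in Python as follows. This is a minimal illustration, not Csmart's actual code: the function name `clamp_to_category` is hypothetical, and the sketch assumes whole-number screen sizes with no gaps between the smallest and largest category.

```python
def clamp_to_category(screen_size, categories):
    """Fold out-of-range screen sizes into the smallest or largest category.

    Assumes `categories` is a gap-free range of whole-number screen sizes.
    """
    lowest, highest = min(categories), max(categories)
    return max(lowest, min(highest, screen_size))


categories = [12, 13, 14, 15, 16, 17, 18, 19]
clamped = [clamp_to_category(s, categories) for s in (10, 12, 15, 19, 21)]
print(clamped)  # [12, 12, 15, 19, 19]
```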

Expected Seed Count per Screen Size for a Specific Weight

For each screen size si, there is an expected count of seeds E(si) based on a known sample weight, the target weight, denoted as Tweight. This can be represented as E = {E(s1), E(s2), ..., E(sn)}. The target weight is typically 300 grams, but it can have any value and depends on the level of accuracy the user requires for their model. Every E(si) must refer to the same Tweight.

Example of Expected Seed Count

For a specific dataset, an example of the expected seed count for a 300g sample might look like this:

Tweight = 300g

E300g = { 12: 3300, 13: 2970, 14: 2910, 15: 2780, 16: 2540, 17: 2250, 18: 1920 }

This information is collected once and is tied to a specific AI model. It represents the occurrence of seeds by weight and screen size for that particular coffee dataset.

Observed Seed Percentage per Screen Size

Let O = {O(s1), O(s2), ..., O(sn)} represent the actual observed count (or area) of seeds for each screen size after the analysis. Dividing each O(si) by the total observed count gives the percentage of each screen size in the analyzed sample. This can be calculated as:

D'(si) = O(si) divided by the sum of O(sj) for j from 1 to n

where D'(si) represents the observed seed percentage for screen size si.
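As a minimal sketch of this step, the Python snippet below turns observed counts into fractions that sum to 1. The function name `observed_fractions` and the three-category counts are hypothetical and serve only as an illustration.

```python
def observed_fractions(observed):
    """Return D'(si) = O(si) / sum of O(sj) for every screen size si."""
    total = sum(observed.values())
    if total == 0:
        raise ValueError("no seeds observed")
    return {size: count / total for size, count in observed.items()}


# Hypothetical counts, chosen so the fractions are easy to check by hand.
fractions = observed_fractions({16: 50, 17: 30, 18: 20})
print(fractions)  # {16: 0.5, 17: 0.3, 18: 0.2}
```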

Required Seeds for the Expected Weight

With the expected seed count per screen size and the observed seed percentage, it is now possible to calculate the number of seeds required to reach the target weight defined above:

required seeds = sum(E(si) multiplied by D'(si) for i from 1 to n)

Calculate Weight Factor

Dividing the total seed count by the required seeds gives the weight factor of the sample. In other words, it indicates how far the sample falls short of the target weight (factor below 1) or exceeds it (factor above 1).

weight factor = total seeds / required seeds

Estimate Total Weight

To estimate the total weight of the sample, it is necessary to multiply the target weight by the weight factor:

estimated weight = Tweight * weight factor
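The steps above can be combined into one short function. The Python sketch below is an illustration under stated assumptions rather than the Csmart implementation: the name `estimate_weight` and the two-category tables are hypothetical, and every observed screen size is assumed to have an entry in the expected-count table.

```python
def estimate_weight(observed, expected, target_weight):
    """Estimate sample weight from observed counts per screen size.

    observed: {screen size: observed seed count}
    expected: {screen size: expected seed count for target_weight}
    """
    total_seeds = sum(observed.values())
    # Observed seed percentage per screen size, D'(si)
    fractions = {s: count / total_seeds for s, count in observed.items()}
    # Seeds required to reach target_weight with this size distribution
    required_seeds = sum(expected[s] * f for s, f in fractions.items())
    weight_factor = total_seeds / required_seeds
    return target_weight * weight_factor


# Hypothetical tables: 800 or 600 seeds of a single size would weigh 100 g.
expected = {16: 800, 18: 600}
observed = {16: 700, 18: 700}
print(estimate_weight(observed, expected, 100))  # 200.0
```

In this example half of the seeds fall in each category, so 700 seeds are required for 100 g; the 1400 observed seeds therefore give a weight factor of 2 and an estimate of 200 g.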

Weight Factor for AI Models

The target weight is not tied to any specific grading methodology; it only relates the seed count to a given weight and screen size. Different grading methodologies require distinct sample weights. For example, the SCA methodology uses 350 grams, while the COB system uses 300 grams. To weight defects accurately and calculate equivalent defects, the required weight for each methodology is divided by the estimated weight, resulting in a unique weight factor for each method. This allows the AI model to apply different grading standards correctly based on these calculated weight factors.
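A small Python sketch of the per-method factor follows. The 350 g and 300 g values come from the text; the function name, the 280 g estimate, and the final step of multiplying a raw defect count by the factor are assumptions made for illustration, since the text does not spell out how the factor is applied.

```python
# Sample weight required by each grading methodology, in grams (from the text).
METHOD_WEIGHTS_G = {"SCA": 350.0, "COB": 300.0}


def method_weight_factors(estimated_weight_g, method_weights=METHOD_WEIGHTS_G):
    """Divide each methodology's required weight by the estimated weight."""
    if estimated_weight_g <= 0:
        raise ValueError("estimated weight must be positive")
    return {name: w / estimated_weight_g for name, w in method_weights.items()}


factors = method_weight_factors(280.0)  # hypothetical estimated weight
print(factors["SCA"])  # 1.25

# Assumed use: scale a raw defect count to the methodology's sample weight.
raw_defects = 8
scaled_defects = {name: raw_defects * f for name, f in factors.items()}
print(scaled_defects["SCA"])  # 10.0
```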

Classes With Different Densities

It is also possible to create separate count-per-screen-size entries for specific classes. For example, considering that the density of husks is significantly lower than that of coffee seeds, one may account for this by generating a separate entry for this class. This can be done for all screen sizes or, if the complexity is too high, represented as a single figure.

On the other hand, certain classes can be excluded from weight calculations to better align the sample with a production environment. For instance, silver skins and husks may appear in large quantities in sample hullers but are unlikely to occur in production dry mills. In such cases, users can simply note which classes should be disregarded when creating the model.
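Excluding classes can be pictured as a filter applied before seeds are counted per screen size. The Python sketch below assumes a hypothetical detection format of (class name, screen size) pairs; the class names and the function `count_by_screen_size` are illustrative and not taken from Csmart.

```python
from collections import Counter


def count_by_screen_size(detections, excluded_classes=frozenset()):
    """Count detections per screen size, skipping excluded classes."""
    return Counter(
        size for cls, size in detections if cls not in excluded_classes
    )


# Hypothetical detections as (class name, screen size) pairs.
detections = [
    ("sound", 16),
    ("husk", 17),
    ("sound", 17),
    ("silver_skin", 15),
    ("broken", 16),
]
counts = count_by_screen_size(detections, {"husk", "silver_skin"})
print(dict(counts))  # {16: 2, 17: 1}
```

The resulting counts can then be used as the observed seed count per screen size in the weight estimation.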

Example of Calculation Using Csmart Approach

Tweight = 100g

Expected Distribution = {10: 1246, 11: 1175, 12: 1103, 13: 990, 14: 970, 15: 925, 16: 845, 17: 745, 18: 640}

Observed seed count per screen size

O = {10: 32, 11: 12, 12: 22, 13: 71, 14: 139, 15: 172, 16: 650, 17: 823, 18: 474}

Observed seed percentage per screen size

Total Observed Seeds = 2395

D'(10) = 32 / 2395 = 1.34%

D'(11) = 12 / 2395 = 0.50%

D'(12) = 22 / 2395 = 0.92%

D'(13) = 71 / 2395 = 2.96%

D'(14) = 139 / 2395 = 5.80%

D'(15) = 172 / 2395 = 7.18%

D'(16) = 650 / 2395 = 27.14%

D'(17) = 823 / 2395 = 34.36%

D'(18) = 474 / 2395 = 19.79%

Required Seeds for the Expected Weight

Calculate the number of seeds needed to reach the expected weight:

Required Seeds = 1246 * 0.0134 + 1175 * 0.0050 + 1103 * 0.0092 + 990 * 0.0296 + 970 * 0.0580 + 925 * 0.0718 + 845 * 0.2714 + 745 * 0.3436 + 640 * 0.1979

Required Seeds = 16.6964 + 5.875 + 10.1476 + 29.304 + 56.26 + 66.415 + 229.333 + 255.982 + 126.656

Required Seeds = 796.669, or about 797 seeds to reach 100g, considering the observed distribution

Calculate Weight Factor

Calculate the weight factor:

Weight Factor = Total Seeds / Required Seeds

Weight Factor = 2395 / 796.669 = 3.006

Estimate Total Weight

Estimate the total weight of the sample:

Estimated Weight = Target Weight * Weight Factor

Estimated Weight = 100 * 3.006 = 300.6 grams
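The same example can be checked with a few lines of Python. This is a sketch for verification only; the variable names are illustrative. Because it keeps the percentages unrounded, its results can differ in the last digit from a hand calculation that rounds each percentage first.

```python
target_weight = 100  # grams

# Expected seed count per screen size for 100 g (from the example above)
expected = {10: 1246, 11: 1175, 12: 1103, 13: 990, 14: 970,
            15: 925, 16: 845, 17: 745, 18: 640}

# Observed seed count per screen size (from the example above)
observed = {10: 32, 11: 12, 12: 22, 13: 71, 14: 139,
            15: 172, 16: 650, 17: 823, 18: 474}

total_seeds = sum(observed.values())
required_seeds = sum(
    expected[size] * observed[size] / total_seeds for size in observed
)
weight_factor = total_seeds / required_seeds
estimated_weight = target_weight * weight_factor

print(total_seeds)                 # 2395
print(round(required_seeds, 1))    # 796.7
print(round(weight_factor, 3))     # 3.006
print(round(estimated_weight, 1))  # 300.6
```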
