<div dir="ltr">Nice work, Stephen, and so fast! <div><br></div><div>One general comment: should we do this in the wiki? (Or some way besides an email chain?)</div><div><br></div><div>Specific comment: Shall we expose "monthly cost of an unlimited transit pass at $2.75"? I'm in favor of that.</div><div><br></div><div>Here is a stab at a viz. Not too exciting. Note the employer pick list at the bottom. More to follow.</div><div><a href="https://public.tableau.com/shared/W54KCSTXD?:display_count=yes&:origin=viz_share_link">https://public.tableau.com/shared/W54KCSTXD?:display_count=yes&:origin=viz_share_link</a><br></div><div><br></div><div>(Again, this viz is shareable by the link only. It's otherwise not publicly visible/searchable.)</div><div><br></div><div>-Jim</div><div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Aug 8, 2019 at 5:26 PM Stephen DeSanto &lt;<a href="mailto:rachidian@gmail.com">rachidian@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Hi everyone,</div><div><br></div><div>I've taken a first pass at the data schema for showing employer transit benefits in our upcoming web app. In this draft, each employer record is represented as follows:</div><div><br></div><div>{<br> "employer": string,<br> "industry": [string],<br> "neighborhood": [string],<br> "alias": [string],<br> "rating": int,</div><div> "description": string,<br></div><div> "badges": [string]<br>}</div><div><br></div><div><b>Employer</b> is a plain text string.</div><div><b>Industry</b> is a list of strings (or a single string, if we want to limit each employer to one industry).</div><div><b>Neighborhood</b> is treated similarly to industry.<br></div><div><b>Alias</b> is a list of other names for the same company. 
For example, <br></div><div><b>Rating</b> is a numerical scale that represents the "worker's monthly cost of an unlimited transit pass". The scale provided during the meeting went from "4 leaves" to "brown tortoise"; aligning to the leaves, that gives us a scale of [-1, 0, 1, 2, 3, 4]. We could adjust this up to 0-5, or lump "piggy bank" and "brown tortoise" in the same rating.</div><div><b>Description</b> is a string that describes the employer's transit benefits, i.e. why they got the rating they did.<br></div><div><b>Badges</b> is a list of strings that represent any additional categories we want to assign to a company (e.g. "industry leader", "polluter").</div><div><br></div><div>We can make changes to this schema if it makes it easier to work with our underlying data visualization platform (e.g. Tableau? DataTables?), but hopefully this is a suitable starting place.</div><div><br></div><div>As an example, take a hypothetical record for Seattle Coffee Works.</div><div><br></div><div>{</div><div> "employer": "Seattle Coffee Works",<br> "industry": ["restaurant"],<br> "neighborhood": ["cbd", "ballard"],<br> "alias": ["Ballard Coffee Works"],<br> "rating": 4,</div><div> "description": "Provides 100% ORCA Passport subsidy.",<br></div><div> "badges": ["leader"]</div><div>}<br></div><div><br></div><div><b>Where Our Data Lives (For Now)</b><br></div><div><br></div><div>I've also taken a rough chop at getting started with the data. 
Here, I've just taken the raw list of ORCA Business Passport employers and assigned a score based on their subsidy percentage, as an example:</div><div><br></div><div><a href="https://docs.google.com/spreadsheets/d/1HmOcG7hJLD1G0unCMPcsDnXr4RIA_PMKEE5ne-hhQR8/edit?usp=sharing" target="_blank">https://docs.google.com/spreadsheets/d/1HmOcG7hJLD1G0unCMPcsDnXr4RIA_PMKEE5ne-hhQR8/edit?usp=sharing</a></div><div><br></div><div>The spreadsheet contains columns for each item of the employer record, as well as some additional columns to record the raw data we have on file for that employer, so we can use that data to automatically or manually determine an employer's rating.<br></div><div><br></div><div>If we have data from other sources not listed (e.g. survey data, City of Seattle data), the "source_" columns can be renamed or added to represent that source's data. For example, if I want to add data from the TRU survey, I might rename "__source_b" to "__TRU Survey", then include results from that survey in that column for each company. (The columns beginning with two underscores are ones I don't expect to be publicly available.)</div><div><br></div><div>PBworks feels really inadequate for editing large data sets, and I don't know where else to put it, so it's living in Google Sheets for now. Set to read-only with the link, for now, but please request editing permissions so you can add stuff to the sheet.<br></div><div><br></div><div>Currently, my expectation is that the spreadsheet will be hand-edited in Google Sheets, and then when we're ready to put live data in the web app, we can export the sheet to a flat file, which we can then import into a format appropriate for the website (big ol' JSON file, database, whatever). Manual process, but probably fine for a project of this scale; I'm open to alternatives. 
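<br><br>As a rough sketch of that export-and-import step, something like the following Python could turn a CSV export of the sheet into a list of employer records. It assumes (none of this is settled) that the column headers match the schema field names, that multi-value cells are separated by semicolons, and that any column starting with two underscores is private and should be dropped; the function names are made up for illustration.
<pre>
```python
import csv
import io
import json

# Fields the draft schema defines as lists of strings.
LIST_FIELDS = ("industry", "neighborhood", "alias", "badges")


def split_list(cell, sep=";"):
    """Split a delimited cell into trimmed, non-empty strings.

    The semicolon separator is an assumption, not part of the schema.
    """
    return [part.strip() for part in (cell or "").split(sep) if part.strip()]


def row_to_record(row):
    """Convert one spreadsheet row (a dict) into an employer record."""
    # Drop private columns: headers that begin with two underscores.
    public = {k: v for k, v in row.items() if not k.startswith("__")}
    record = {
        "employer": public["employer"].strip(),
        "rating": int(public["rating"]),
        "description": public["description"].strip(),
    }
    for field in LIST_FIELDS:
        record[field] = split_list(public.get(field))
    return record


def csv_to_records(csv_text):
    """Parse exported CSV text into a list of employer records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row_to_record(row) for row in reader]


if __name__ == "__main__":
    # Hypothetical export: one row based on the Seattle Coffee Works example,
    # plus a made-up value in a private "__source_b" column.
    sample = (
        "employer,industry,neighborhood,alias,rating,description,badges,__source_b\n"
        "Seattle Coffee Works,restaurant,cbd; ballard,Ballard Coffee Works,4,"
        "Provides 100% ORCA Passport subsidy.,leader,100\n"
    )
    print(json.dumps(csv_to_records(sample), indent=2))
```
</pre>
Running it on the sample row prints a record shaped like the Seattle Coffee Works example, with the private column left out. A database import could reuse the same row conversion and swap only the final JSON step.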
</div><div><br></div><div><b>Things To Do Next</b><br></div><div><br></div><div>Aside from the ORCA Passport data and the data we collected through the TRU survey / legwork (on PBworks), do we have any other data sources that would provide context for a score?</div><div><br></div><div>For the data sources we have, we'll have to start filling out the rest of the spreadsheet, I guess?<br></div><div><br></div><div>Also, we will need to determine:</div><div>a) the master list of "industries" we want to support,</div><div>b) the "industry" field(s) for each employer,</div><div>c) the "neighborhood" field(s) for each employer we don't have one for (or being more precise than what I have now), and<br></div><div>d) which companies get tagged with which badges.</div><div><br></div><div>Hope that helps.<br></div><div><br></div><div>In solidarity,</div><div><br></div><div>Stephen<br></div><div><br></div></div>
<br>
</blockquote></div>