[TRU Research] Web App Data Schema
Jim Walseth
jim.walseth at gmail.com
Sat Aug 10 16:31:56 PDT 2019
Nice work Stephen and so fast!
One general comment: should we do this in the wiki? (Or some way other than
an email chain?)
Specific comment: Shall we expose "monthly cost of an unlimited transit
pass at $2.75"? I'm in favor of that.
Here is a stab at a viz. Not too exciting. Note the employer pick list at
the bottom. More to follow.
https://public.tableau.com/shared/W54KCSTXD?:display_count=yes&:origin=viz_share_link
(Again, this viz is shareable by the link only. It's otherwise not publicly
visible/searchable.)
-Jim
On Thu, Aug 8, 2019 at 5:26 PM Stephen DeSanto <rachidian at gmail.com> wrote:
> Hi everyone,
>
> I've taken a first pass at the data schema for showing employer transit
> benefits in our upcoming web app. In this draft, each employer record is
> represented as follows:
>
> {
> "employer": string,
> "industry": [string],
> "neighborhood": [string],
> "alias": [string],
> "rating": int,
> "description": string,
> "badges": [string]
> }
>
> *Employer* is a plain text string.
> *Industry* is a list of strings (or a single string, if we want to limit
> each employer to one industry).
> *Neighborhood* is treated the same way as industry.
> *Alias* is a list of other names for the same company. For example,
> *Rating* is a numerical scale that represents the "worker's monthly cost
> of an unlimited transit pass". The scale provided during the meeting went
> from "4 leaves" to "brown tortoise"; aligning to the leaves, that gives us
> a scale of [-1, 0, 1, 2, 3, 4]. We could adjust this up to 0-5, or lump
> "piggy bank" and "brown tortoise" in the same rating.
> *Description* is a string that describes the employer's transit benefits,
> i.e. why they got the rating they did.
> *Badges* is a list of strings that represent any additional categories we
> want to assign to a company (e.g. "industry leader", "polluter").
>
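As a rough illustration of the record described above, here is a minimal Python sketch that checks the field types. The class name EmployerRecord and the RATING_MIN / RATING_MAX bounds are assumptions for illustration only; the bounds use the widest scale mentioned in the draft (-1 through 4) and would change if the scale is shifted to 0-5 or the bottom two ratings are merged.

```python
from dataclasses import dataclass, field
from typing import List

# Assumption: the widest scale mentioned in the draft, -1 through 4.
RATING_MIN, RATING_MAX = -1, 4


@dataclass
class EmployerRecord:
    """One employer record, mirroring the draft schema (hypothetical class)."""
    employer: str
    rating: int
    description: str
    industry: List[str] = field(default_factory=list)
    neighborhood: List[str] = field(default_factory=list)
    alias: List[str] = field(default_factory=list)
    badges: List[str] = field(default_factory=list)

    def __post_init__(self):
        if not isinstance(self.employer, str) or not self.employer.strip():
            raise ValueError("employer must be a non-empty string")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(self.rating, bool) or not isinstance(self.rating, int):
            raise ValueError("rating must be an int")
        if not RATING_MIN <= self.rating <= RATING_MAX:
            raise ValueError(f"rating must be in [{RATING_MIN}, {RATING_MAX}]")
        if not isinstance(self.description, str):
            raise ValueError("description must be a string")
        # The four multi-valued fields must each be a list of strings.
        for name in ("industry", "neighborhood", "alias", "badges"):
            value = getattr(self, name)
            if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
                raise ValueError(f"{name} must be a list of strings")


record = EmployerRecord(
    employer="Seattle Coffee Works",
    industry=["restaurant"],
    neighborhood=["cbd", "ballard"],
    alias=["Ballard Coffee Works"],
    rating=4,
    description="Provides 100% ORCA Passport subsidy.",
    badges=["leader"],
)
```

If we decide on one industry per employer, the industry field would become a plain string and the list check would drop it.
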
> We can make changes to this schema if it makes it easier to work with our
> underlying data visualization platform (e.g. Tableau? DataTables?), but
> hopefully this is a suitable starting place.
>
> As an example, take a hypothetical record for Seattle Coffee Works.
>
> {
> "employer": "Seattle Coffee Works",
> "industry": ["restaurant"],
> "neighborhood": ["cbd", "ballard"],
> "alias": ["Ballard Coffee Works"],
> "rating": 4,
> "description": "Provides 100% ORCA Passport subsidy.",
> "badges": ["leader"]
> }
>
> *Where Our Data Lives (For Now)*
>
> I've also taken a rough chop at getting started with the data. Here, I've
> just taken the raw list of ORCA Business Passport employers and assigned a
> score based on their subsidy percentage, as an example:
>
>
> https://docs.google.com/spreadsheets/d/1HmOcG7hJLD1G0unCMPcsDnXr4RIA_PMKEE5ne-hhQR8/edit?usp=sharing
>
> The spreadsheet contains columns for each item of the employer record, as
> well as some additional columns to record the raw data we have on file for
> that employer, so we can use that data to automatically or manually
> determine an employer's rating.
>
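To make "assigned a score based on their subsidy percentage" concrete, here is a minimal Python sketch of one way to do the mapping. The function name rating_from_subsidy and every cutoff below 100% are placeholders, not the values used in the spreadsheet; the only pairing taken from this thread is the example record, where a 100% subsidy has a rating of 4.

```python
def rating_from_subsidy(subsidy_pct: float) -> int:
    """Map a pass subsidy percentage (0-100) to a 0-4 rating.

    Hypothetical helper: the cutoffs are placeholders for discussion.
    """
    if not 0 <= subsidy_pct <= 100:
        raise ValueError("subsidy_pct must be between 0 and 100")
    # (minimum percentage, rating), highest first.
    # Only 100 -> 4 matches the example record; the rest are assumed.
    cutoffs = [(100, 4), (75, 3), (50, 2), (25, 1)]
    for minimum, rating in cutoffs:
        if subsidy_pct >= minimum:
            return rating
    return 0


full_subsidy_rating = rating_from_subsidy(100)
partial_subsidy_rating = rating_from_subsidy(80)
```

Keeping the cutoffs in one list means the scale can be re-tuned (or extended to -1) without touching the rest of the logic.
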
> If we have data from other sources not listed (e.g. survey data, City of
> Seattle data), the "source_" columns can be renamed or added to represent
> that source's data. For example, if I want to add data from the TRU survey,
> I might rename "__source_b" to "__TRU Survey", then include results from
> that survey in that column for each company. (The columns beginning with
> two underscores are ones I don't expect to be publicly available.)
>
> PBworks feels really inadequate for editing large data sets, and I don't
> know where else to put it, so it's living in Google Sheets for now. It's
> read-only via the link at the moment, but please request editing permissions
> so you can add stuff to the sheet.
>
> Currently, my expectation is that the spreadsheet will be hand-edited in
> Google Sheets, and then when we're ready to put live data in the web app,
> we can export the sheet to a flat file, which we can then import into a
> format appropriate for the website (big ol' JSON file, database, whatever).
> Manual process, but probably fine for a project of this scale; I'm open to
> alternatives.
>
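The export step described above could look something like the following Python sketch, which reads a CSV export and produces JSON-ready records while dropping the private columns that begin with two underscores. Several things here are assumptions: that the sheet's column headers match the schema field names, that multi-valued cells are separated by semicolons, and the function name rows_to_records itself. The sample row is the hypothetical Seattle Coffee Works record, not real data.

```python
import csv
import io
import json

# Fields that hold lists in the draft schema.
LIST_FIELDS = ("industry", "neighborhood", "alias", "badges")


def rows_to_records(csv_text: str, list_sep: str = ";") -> list:
    """Convert CSV text to a list of dicts, skipping '__' columns."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = {}
        for column, value in row.items():
            # Skip private columns and any overflow cells (key None).
            if column is None or column.startswith("__"):
                continue
            value = (value or "").strip()
            if column in LIST_FIELDS:
                # Assumed convention: semicolon-separated values in one cell.
                record[column] = [v.strip() for v in value.split(list_sep) if v.strip()]
            elif column == "rating":
                record[column] = int(value) if value else None
            else:
                record[column] = value
        records.append(record)
    return records


sample = (
    "employer,industry,neighborhood,alias,rating,description,badges,__source_b\n"
    "Seattle Coffee Works,restaurant,cbd;ballard,Ballard Coffee Works,4,"
    "Provides 100% ORCA Passport subsidy.,leader,raw notes\n"
)
records = rows_to_records(sample)
print(json.dumps(records, indent=2))
```

For the real export, the same function could be fed the text of the downloaded file instead of the inline sample.
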
> *Things To Do Next*
>
> Aside from the ORCA Passport data and the data we collected through TRU
> survey / legwork (on PBworks), do we have any other data sources that would
> provide context for a score?
>
> For the data sources we have, we'll have to start filling out the rest of
> the spreadsheet, I guess?
>
> Also, we will need to determine:
> a) a master list of "industries" we want to support,
> b) "industry" field(s) for each employer,
> c) "neighborhood" field(s) for each employer that lacks one (or more
> precise values than what I have now), and
> d) which companies get tagged with which badges.
>
> Hope that helps.
>
> In solidarity,
>
> Stephen
>
>
>