<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-cite-prefix">Regarding where to have this discussion
- I'm just gonna continue the email chain cause I haven't followed
where to put the discussion on the wiki, but someone feel free to
steer it over there if we want to!</div>
<div class="moz-cite-prefix"><br>
A brief update regarding establishing a larger list of employers
to include in the dataset: basic contact information for all
Seattle businesses, sorted by the North American Industry
Classification System, is available at <a
href="https://web6.seattle.gov/fas/slimbizsearch/ResultsPage.aspx?NAICList=Top100">https://web6.seattle.gov/fas/slimbizsearch/ResultsPage.aspx?NAICList=Top100</a>,
but it's a huge list of course, with no info on number of
employees or revenue to filter out the smaller ones. Still, I did
send off an email about getting a copy of the database just for
purposes of cross-referencing names and such.<br>
On 8/10/19 6:42 PM, Katie Wilson wrote:<br>
</div>
<blockquote type="cite"
cite="mid:09AFFEF2-2BC9-4B67-9BDE-D526954EF246@transitriders.org">
For “neighborhood” I think it makes sense to use the <a
href="https://www.seattle.gov/transportation/projects-and-programs/programs/transportation-options-program/commute-trip-reduction-program/draft-2019-2023-networks-and-targets"
class="" moz-do-not-send="true">“CTR Network Areas” as defined
here</a>.<br class="">
<div class=""><br class="">
</div>
<div class="">For “industry” I think it makes sense to use the <a
href="https://www.seattle.gov/Documents/Departments/SDOT/TransportationOptionsProgram/CTR_Draft_Strategic_Plan_Jan2019.pdf"
class="" moz-do-not-send="true">“Employment Sector” categories
listed on Page 12 of this CTR strategic plan.</a></div>
<div class=""><br class="">
</div>
<div class="">
<div class="">On the ratings, I think it does make sense to
lump "piggy bank" and "brown tortoise" in the same rating (0),
and then add a tortoise badge for employers that aren’t even
doing the pre-tax thing.</div>
<div class=""><br class="">
</div>
<div class="">Another simplification option to consider would be
to lump together 3 and 4 leaves. But let’s leave them separate
for now and depending on how things shake out we can easily
combine them later.</div>
</div>
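<div class="">A rough sketch of how the lumped rating could be computed, in Python. The leaf cutoffs are placeholders (the thread does not say which subsidy level earns which number of leaves), and the names rate_employer, LEAF_CUTOFFS and the "tortoise" badge string are made up for illustration. It also assumes "piggy bank" means a pre-tax-only employer.</div>
<pre class="">
```python
# Hypothetical cutoffs: the thread does not say which subsidy level earns
# which number of leaves, so these percentages are placeholders.
LEAF_CUTOFFS = [(100, 4), (75, 3), (50, 2), (1, 1)]


def rate_employer(subsidy_percent, offers_pretax):
    """Return (rating, badges) for one employer."""
    for cutoff, leaves in LEAF_CUTOFFS:
        if subsidy_percent >= cutoff:
            return leaves, []
    # No subsidy at all: "piggy bank" and "brown tortoise" share rating 0,
    # and the badge marks employers without even a pre-tax option.
    badges = [] if offers_pretax else ["tortoise"]
    return 0, badges


print(rate_employer(0, offers_pretax=True))
print(rate_employer(0, offers_pretax=False))
```
</pre>
<div class="">Keeping 3 and 4 leaves as separate rows in the cutoff table means they can be merged later by editing one line.</div>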
<div class=""><br class="">
</div>
<div class="">We don’t have any major sources of data on what
benefits employers provide other than:</div>
<div class="">— <a
href="https://seattletransitpasses-research.pbworks.com/w/page/133438080/First%20Public%20Records%20Request"
class="" moz-do-not-send="true">Metro public disclosure
request spreadsheet</a></div>
<div class="">— Our commute survey</div>
<div class="">— Info gleaned online from company websites, asking
around, glassdoor etc (what I’ve found I’ve added to the <a
href="https://seattletransitpasses-research.pbworks.com/w/page/132177123/Employers"
class="" moz-do-not-send="true">relevant tables in the wiki</a>,
on CTR employers and “potential poster children” and “likely
target assessment” and “hotels”)</div>
<div class=""><br class="">
</div>
<div class="">Maybe it makes sense to have another string
indicating sufficient certainty — when we have two sources, or
one very reliable source, we enter an X or whatever, and that
gives us the green light to display that data. Also it may not
make sense to put a lot of work into categorizing employers into
Network Area and Employment Sector until we have reliable data
on what benefits they’re offering.</div>
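<div class="">The "sufficient certainty" rule could look something like this in Python. This is only a sketch: the source labels and the contents of VERY_RELIABLE are assumptions, and the result could just as well be stored as the "X" marker in the sheet.</div>
<pre class="">
```python
# Assumption: which sources count as "very reliable" is a judgment call;
# this set is a placeholder, not a decision from the thread.
VERY_RELIABLE = {"hr_contact"}


def has_sufficient_certainty(sources):
    """True when there are two distinct sources, or one very reliable one."""
    distinct = set(sources)
    if distinct.intersection(VERY_RELIABLE):
        return True
    return len(distinct) >= 2


print(has_sufficient_certainty(["metro", "survey"]))
print(has_sufficient_certainty(["metro"]))
```
</pre>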
<div class=""><br class="">
</div>
<div class="">Speaking of Seattle Coffee Works, I spoke with their
HR person a few months ago and actually employees have to pay
$20/month (pre-tax $) if they want an ORCA card. Still a great
deal but not 100% subsidy as reported in the Metro data— which,
I then learned, is self-reported by the company. Metro only
knows that all those companies are signed up for the Passport
program. I noted the real situation <a
href="https://seattletransitpasses-research.pbworks.com/w/page/133439169/Potential%20Poster%20Children"
class="" moz-do-not-send="true">on this page</a>. Anyway, the
point is we should probably crosscheck the Metro data as much as
we can with our survey or other sources of information.
<div class=""><br class="">
</div>
<div class="">(Also speaking of Seattle Coffee Works they have
locations in <a
href="https://www.seattlecoffeeworks.com/our-cafes.aspx"
class="" moz-do-not-send="true">Capitol Hill &amp; Cascade
too</a>. From talking with the HR person I’m pretty sure all
are included in their passport program, and the employees swap
around a lot from location to location. They probably use the
Ballard location as home base for transit pass purposes since
that’s the least expensive zone.)</div>
</div>
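<div class="">A small Python sketch of that crosscheck, assuming we reduce each source to a simple "is the pass fully subsidized?" flag per employer. The function name and the dictionary shape are hypothetical; the only data point used is the Seattle Coffee Works case described above.</div>
<pre class="">
```python
def find_discrepancies(metro_reported, verified):
    """Return employers whose self-reported Metro value differs from ours.

    Both arguments map employer name to True/False for "fully subsidized"
    (a hypothetical shape chosen for this sketch).
    """
    mismatches = {}
    for name, metro_value in metro_reported.items():
        if name in verified and verified[name] != metro_value:
            mismatches[name] = (metro_value, verified[name])
    return mismatches


# Metro data says 100% subsidy; HR says employees pay $20/month.
metro_reported = {"Seattle Coffee Works": True}
verified = {"Seattle Coffee Works": False}
print(find_discrepancies(metro_reported, verified))
```
</pre>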
<div class=""><br class="">
</div>
<div class="">One project would be to come up with a list of
employers that have name recognition (or that we are interested
in for some other reason) and put a little work into attaining
sufficient certainty. If we posted the list to a page and put a
call out on social media and email I bet we’d get some answers.</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">
<div>
<blockquote type="cite" class="">
<div class="">On Aug 8, 2019, at 5:26 PM, Stephen DeSanto
&lt;<a href="mailto:rachidian@gmail.com" class=""
moz-do-not-send="true">rachidian@gmail.com</a>&gt;
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div dir="ltr" class="">
<div class="">Hi everyone,</div>
<div class=""><br class="">
</div>
<div class="">I've taken a first pass at the data
schema for showing employer transit benefits in our
upcoming web app. In this draft, each employer
record is represented as follows:</div>
<div class=""><br class="">
</div>
<div class="">{<br class="">
"employer": string,<br class="">
"industry": [string],<br class="">
"neighborhood": [string],<br class="">
"alias": [string],<br class="">
"rating": int,</div>
<div class=""> "description": string,<br class="">
</div>
<div class=""> "badges": [string]<br class="">
}</div>
<div class=""><br class="">
</div>
<div class=""><b class="">Employer</b> is a plain text
string.</div>
<div class=""><b class="">Industry</b> is a list of
strings (or a single string, if we want to limit one
employer = one industry).</div>
<div class=""><b class="">Neighborhood</b> is a list of strings,
handled the same way as industry.<br class="">
</div>
<div class=""><b class="">Alias</b> is a list of other
names for the same company. For example, <br
class="">
</div>
<div class=""><b class="">Rating</b> is a numerical
scale that represents the "worker's monthly cost of
an unlimited transit pass". The scale provided
during the meeting went from "4 leaves" to "brown
tortoise"; aligning to the leaves, that gives us a
scale of [-1, 0, 1, 2, 3, 4]. We could adjust this
up to 0-5, or lump "piggy bank" and "brown tortoise"
in the same rating.</div>
<div class=""><b class="">Description</b> is a string
that describes the employer's transit benefits, i.e.
why they got the rating they did.<br class="">
</div>
<div class=""><b class="">Badges</b> is a list of
strings that represent any additional categories we
want to assign to a company (e.g. "industry leader",
"polluter").</div>
<div class=""><br class="">
</div>
<div class="">We can make changes to this schema if it
makes it easier to work with our underlying data
visualization platform (e.g. Tableau? DataTables?),
but hopefully this is a suitable starting place.</div>
<div class=""><br class="">
</div>
<div class="">As an example, take a hypothetical
record for Seattle Coffee Works.</div>
<div class=""><br class="">
</div>
<div class="">{</div>
<div class=""> "employer": "Seattle Coffee Works",<br
class="">
"industry": ["restaurant"],<br class="">
"neighborhood": ["cbd", "ballard"],<br class="">
"alias": ["Ballard Coffee Works"],<br class="">
"rating": 4,</div>
<div class=""> "description": "Provides 100% ORCA
Passport subsidy.",<br class="">
</div>
<div class=""> "badges": ["leader"]</div>
<div class="">}<br class="">
</div>
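<div class="">For anyone who prefers code to prose, here is one way the schema and the example record above might be expressed in Python. It is a sketch, not a decision: the class name EmployerRecord and the rating check are assumptions, and the bounds simply follow the [-1, 0, 1, 2, 3, 4] scale mentioned earlier.</div>
<pre class="">
```python
from dataclasses import dataclass, field
from typing import List

# Assumed bounds, taken from the [-1, 0, 1, 2, 3, 4] scale described above.
MIN_RATING = -1
MAX_RATING = 4


@dataclass
class EmployerRecord:
    employer: str
    rating: int
    description: str = ""
    industry: List[str] = field(default_factory=list)
    neighborhood: List[str] = field(default_factory=list)
    alias: List[str] = field(default_factory=list)
    badges: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Reject ratings outside the scale so typos surface early.
        if self.rating not in range(MIN_RATING, MAX_RATING + 1):
            raise ValueError(f"rating {self.rating} is outside the scale")


record = EmployerRecord(
    employer="Seattle Coffee Works",
    industry=["restaurant"],
    neighborhood=["cbd", "ballard"],
    alias=["Ballard Coffee Works"],
    rating=4,
    description="Provides 100% ORCA Passport subsidy.",
    badges=["leader"],
)
print(record)
```
</pre>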
<div class=""><br class="">
</div>
<div class=""><b class="">Where Our Data Lives (For
Now)</b><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">I've also taken a rough chop at getting
started with the data. Here, I've just taken the raw
list of ORCA Business Passport employers and
assigned a score based on their subsidy percentage,
as an example:</div>
<div class=""><br class="">
</div>
<div class=""><a
href="https://docs.google.com/spreadsheets/d/1HmOcG7hJLD1G0unCMPcsDnXr4RIA_PMKEE5ne-hhQR8/edit?usp=sharing"
class="" moz-do-not-send="true">https://docs.google.com/spreadsheets/d/1HmOcG7hJLD1G0unCMPcsDnXr4RIA_PMKEE5ne-hhQR8/edit?usp=sharing</a></div>
<div class=""><br class="">
</div>
<div class="">The spreadsheet contains columns for
each item of the employer record, as well as some
additional columns to record the raw data we have on
file for that employer, so we can use that data to
automatically or manually determine an employer's
rating.<br class="">
</div>
<div class=""><br class="">
</div>
<div class="">If we have data from other sources not
listed (e.g. survey data, City of Seattle data), the
"source_" columns can be renamed or added to
represent that source's data. For example, if I want
to add data from the TRU survey, I might rename
"__source_b" to "__TRU Survey", then include results
from that survey in that column for each company.
(The columns beginning with two underscores are ones
I don't expect to be publicly available.)</div>
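<div class="">The double-underscore convention makes it easy to strip private columns before anything is published. A minimal Python sketch, where public_view is a made-up helper name and the survey cell value is a placeholder:</div>
<pre class="">
```python
def public_view(row):
    """Drop columns whose names start with two underscores."""
    return {key: value for key, value in row.items() if not key.startswith("__")}


row = {
    "employer": "Seattle Coffee Works",
    "rating": 4,
    "__TRU Survey": "(private survey notes)",  # placeholder value
}
print(public_view(row))
```
</pre>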
<div class=""><br class="">
</div>
<div class="">PBworks feels really inadequate for
editing large data sets, and I don't know where else
to put it, so it's living in Google Sheets for now.
It's read-only for anyone with the link at the moment, so please
request editing permissions if you want to add to the
sheet.<br class="">
</div>
<div class=""><br class="">
</div>
<div class="">Currently, my expectation is that the
spreadsheet will be hand-edited in Google Sheets,
and then when we're ready to put live data in the
web app, we can export the sheet to a flat file,
which we can then import into a format appropriate
for the website (big ol' JSON file, database,
whatever). Manual process, but probably fine for a
project of this scale; I'm open to alternatives. </div>
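<div class="">To make the export step concrete, here is a sketch in Python of turning a CSV export into JSON records. It assumes the sheet's column names match the schema fields and that list-valued cells are separated by semicolons; neither is confirmed by the actual spreadsheet, and sheet_to_records is a hypothetical name.</div>
<pre class="">
```python
import csv
import io
import json

# Assumption: these columns hold several values in one cell.
LIST_FIELDS = ("industry", "neighborhood", "alias", "badges")


def sheet_to_records(csv_text, separator=";"):
    """Parse exported CSV text into a list of employer records."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = dict(row)
        for name in LIST_FIELDS:
            cell = row.get(name) or ""
            record[name] = [p.strip() for p in cell.split(separator) if p.strip()]
        record["rating"] = int(row["rating"])
        records.append(record)
    return records


sample = (
    "employer,industry,neighborhood,alias,rating,description,badges\n"
    "Seattle Coffee Works,restaurant,cbd;ballard,Ballard Coffee Works,4,"
    "Provides 100% ORCA Passport subsidy.,leader\n"
)
print(json.dumps(sheet_to_records(sample), indent=2))
```
</pre>
<div class="">Reading from a string keeps the sketch self-contained; in practice the text would come from the file Google Sheets exports.</div>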
<div class=""><br class="">
</div>
<div class=""><b class="">Things To Do Next</b><br
class="">
</div>
<div class=""><br class="">
</div>
<div class="">Aside from the ORCA Passport data and
the data we collected through TRU survey / legwork
(on PBworks), do we have any other data sources that
would provide context for a score?</div>
<div class=""><br class="">
</div>
<div class="">For the data sources we have, we'll have
to start filling out the rest of the spreadsheet, I
guess?<br class="">
</div>
<div class=""><br class="">
</div>
<div class="">Also, we will need to determine:</div>
<div class="">a) a master list of "industries" we want
to support</div>
<div class="">b) "industry" field(s) for each employer</div>
<div class="">c) "neighborhood" field(s) for each
employer that doesn't have one yet (or more precise
values than what I have now)<br class="">
</div>
<div class="">d) which companies get tagged with which
badges</div>
<div class=""><br class="">
</div>
<div class="">Hope that helps.<br class="">
</div>
<div class=""><br class="">
</div>
<div class="">In solidarity,</div>
<div class=""><br class="">
</div>
<div class="">Stephen<br class="">
</div>
<div class=""><br class="">
</div>
</div>
<br class="">
</div>
</blockquote>
</div>
<br class="">
</div>
</div>
<br>
</blockquote>
</body>
</html>